MOLECULES FOR DIAGNOSTICS AND THERAPEUTICS

TECHNICAL FIELD

The present invention relates to human molecules and to the use of these sequences in the
diagnosis, study, prevention, and treatment of diseases associated with, as well as effects of
exogenous compounds on, the expression of human molecules.

BACKGROUND OF THE INVENTION

The human genome comprises thousands of genes, many encoding gene products that
function in the maintenance and growth of the various cells and tissues in the body. Aberrant
expression or mutations in these genes and their products is the cause of, or is associated with, a
variety of human diseases such as cancer and other cell proliferative disorders,
autoimmune/inflammatory disorders, infections, developmental disorders, endocrine disorders,
metabolic disorders, neurological disorders, gastrointestinal disorders, transport disorders, and
connective tissue disorders. The identification of these genes and their products is the basis of an
ever-expanding effort to find markers for early detection of diseases, and targets for their prevention
and treatment. Therefore, these genes and their products are useful as diagnostics and therapeutics.
These genes may encode, for example, enzyme molecules, molecules associated with growth and
development, biochemical pathway molecules, extracellular information transmission molecules,
receptor molecules, intracellular signaling molecules, membrane transport molecules, protein
modification and maintenance molecules, nucleic acid synthesis and modification molecules,
adhesion molecules, antigen recognition molecules, secreted and extracellular matrix molecules,
cytoskeletal molecules, ribosomal molecules, electron transfer associated molecules, transcription
factor molecules, chromatin molecules, cell membrane molecules, and organelle associated
molecules.

For example, cancer represents a type of cell proliferative disorder that affects nearly every
tissue in the body. A wide variety of molecules, either aberrantly expressed or mutated, can be the
cause of, or involved with, various cancers because tissue growth involves complex and ordered
patterns of cell proliferation, cell differentiation, and apoptosis. Cell proliferation must be regulated
to maintain both the number of cells and their spatial organization. This regulation depends upon the
appropriate expression of proteins which control cell cycle progression in response to extracellular
signals such as growth factors and other mitogens, and intracellular cues such as DNA damage or
nutrient starvation. Molecules which directly or indirectly modulate cell cycle progression fall into
several categories, including growth factors and their receptors, second messenger and signal
transduction proteins, oncogene products, tumor-suppressor proteins, and mitosis-promoting factors.
Aberrant expression or mutations in any of these gene products can result in cell proliferative

oncogenesis). Oncoproteins, encoded by oncogenes, can

genes and their products have been found to be associated with
cancer, many more may exist that are yet to be discovered.

Enzyme Molecules

and biodegradation involve a number of key enzyme

the substrate binding domain. In addition to the well-known role in detoxification of ethanol, SCADs
are also involved in synthesis and degradation of fatty acids, steroids, and some prostaglandins, and
are therefore implicated in a variety of disorders such as lipid storage disease, myopathy, SCAD
deficiency, and certain genetic disorders. For example, retinol dehydrogenase is a SCAD-family
member (Simon, A. et al. (1995) J. Biol. Chem. 270:1107-1112) that converts retinol to retinal, the
precursor of retinoic acid. Retinoic acid, a regulator of differentiation and apoptosis, has been shown
to down-regulate genes involved in cell proliferation and inflammation (Chai, X. et al. (1995) J. Biol.
Chem. 270:3900-3904). In addition, retinol dehydrogenase has been linked to hereditary eye diseases
such as autosomal recessive childhood-onset severe retinal dystrophy (Simon, A. et al. (1996)
Genomics 36:424-430).

Propagation of nerve impulses, modulation of cell proliferation and differentiation, induction
of the immune response, and tissue homeostasis involve neurotransmitter metabolism (Weiss, B.
(1991) Neurotoxicology 12:379-386; Collins, S.M. et al. (1992) Ann. N.Y. Acad. Sci. 664:415-424;
Brown, J.K. and H. Imam (1991) J. Inherit. Metab. Dis. 14:436-458). Many pathways of
neurotransmitter metabolism require oxidoreductase activity, coupled to reduction or oxidation of a
cofactor, such as NAD+/NADH (Newsholme, E.A. and A.R. Leech (1983) Biochemistry for the
Medical Sciences, John Wiley and Sons, Chichester, U.K. pp. 779-793). Degradation of
catecholamines (epinephrine or norepinephrine) requires alcohol dehydrogenase (in the brain) or
aldehyde dehydrogenase (in peripheral tissue). NAD+-dependent aldehyde dehydrogenase oxidizes
5-hydroxyindole-3-acetate (the product of 5-hydroxytryptamine (serotonin) metabolism) in the brain,
blood platelets, liver and pulmonary endothelium (Newsholme, supra, p. 786). Other
neurotransmitter degradation pathways that utilize NAD+/NADH-dependent oxidoreductase activity
include those of L-DOPA (precursor of dopamine, a neuronal excitatory compound), glycine (an
inhibitory neurotransmitter in the brain and spinal cord), histamine (liberated from mast cells during
the inflammatory response), and taurine (an inhibitory neurotransmitter of the brain stem, spinal cord
and retina) (Newsholme, supra, pp. 790, 792). Epigenetic or genetic defects in neurotransmitter
metabolic pathways can result in a spectrum of disease states in different tissues including Parkinson's
disease and inherited myoclonus (McCance, K.L. and S.E. Huether (1994) Pathophysiology, Mosby-Year
Book, Inc., St. Louis MO, pp. 402-404; Gundlach, A.L. (1990) FASEB J. 4:2761-2766).

Tetrahydrofolate is a derivatized glutamate molecule that acts as a carrier, providing
activated one-carbon units to a wide variety of biosynthetic reactions, including synthesis of purines,
pyrimidines, and the amino acid methionine. Tetrahydrofolate is generated by the activity of a
holoenzyme complex called tetrahydrofolate synthase, which includes three enzyme activities:
tetrahydrofolate dehydrogenase, tetrahydrofolate cyclohydrolase, and tetrahydrofolate synthetase.
Thus, tetrahydrofolate dehydrogenase plays an important role in generating building blocks for
nucleic and amino acids, crucial to proliferating cells.

3-Hydroxyacyl-CoA dehydrogenase (3HACD) is involved in fatty acid metabolism. It
catalyzes the oxidation of 3-hydroxyacyl-CoA to 3-oxoacyl-CoA, with concomitant reduction of
NAD to NADH, in the mitochondria and peroxisomes of eukaryotic cells. In peroxisomes, 3HACD
and enoyl-CoA hydratase form an enzyme complex called bifunctional enzyme, defects in which are
associated with peroxisomal bifunctional enzyme deficiency. This interruption in fatty acid
metabolism produces accumulation of very-long chain fatty acids, disrupting development of the
brain, bone, and adrenal glands. Infants born with this deficiency typically die within 6 months
(Watkins, P. et al. (1989) J. Clin. Invest. 83:771-777; Online Mendelian Inheritance in Man (OMIM)
#261515). The neurodegeneration that is characteristic of Alzheimer's disease involves development
of extracellular plaques in certain brain regions. A major protein component of these plaques is the
peptide amyloid-β (Aβ), which is one of several cleavage products of amyloid precursor protein
(APP). 3HACD has been shown to bind the Aβ peptide, and is overexpressed in neurons affected in
Alzheimer's disease. In addition, an antibody against 3HACD can block the toxic effects of Aβ in a
cell culture model of Alzheimer's disease (Yan, S. et al. (1997) Nature 389:689-695; OMIM
#602057).
Steroids, such as estrogen, testosterone, corticosterone, and others

and interconverted. A variety of

hydroxysteroid dehydrogenases, are involved in hypertension, fertility, and cancer (Duax, W.L. and
D. Ghosh (1997) Steroids 62:95-100). One such dehydrogenase is 3-oxo-5-α-steroid dehydrogenase
(OASD), a microsomal membrane protein highly expressed in prostate and other
androgen-responsive tissues. OASD catalyzes the conversion of testosterone into dihydrotestosterone, which is
the most potent androgen. Dihydrotestosterone is essential for the formation of the male phenotype
during embryogenesis, as well as for proper androgen-mediated growth of tissues such as the prostate
and male genitalia. A defect in OASD that prevents the conversion of testosterone into
dihydrotestosterone leads to a rare form of male pseudohermaphroditis, characterized by defective
formation of the external genitalia (Andersson, S. et al. (1991) Nature 354:159-161; Labrie, F. et al.
(1992) Endocrinology 131:1571-1573; OMIM #264600). Thus, OASD plays a central role in sexual
differentiation and androgen physiology.

17β-hydroxysteroid dehydrogenase (17βHSD6) plays an important role in the regulation of
the male reproductive hormone, dihydrotestosterone (DHTT). 17βHSD6 acts to reduce levels of
DHTT by oxidizing a precursor of DHTT, 3α-diol, to androsterone which is readily glucuronidated
and removed from tissues. 17βHSD6 is active with both androgen and estrogen substrates when
expressed in embryonic kidney 293 cells. At least five other isozymes of 17βHSD have been
identified that catalyze oxidation and/or reduction reactions in various tissues, with preferences for
different steroid substrates (Biswas, M.G. and D.W. Russell (1997) J. Biol. Chem. 272:15959-15966).
For example, 17βHSD1 preferentially reduces estradiol and is abundant in the ovary and placenta.
17βHSD2 catalyzes oxidation of androgens and is present in the endometrium and placenta.
17βHSD3 is exclusively a reductive enzyme in the testis (Geissler, W.M. et al. (1994) Nat. Genet.
7:34-39). An excess of androgens such as DHTT can contribute to certain disease states such as
benign prostatic hyperplasia and prostate cancer.

Oxidoreductases are components of the fatty acid metabolism pathways in mitochondria and
peroxisomes. The main beta-oxidation pathway degrades both saturated and unsaturated fatty acids,
while the auxiliary pathway performs additional steps required for the degradation of unsaturated
fatty acids. The auxiliary beta-oxidation enzyme 2,4-dienoyl-CoA reductase catalyzes the removal of
even-numbered double bonds from unsaturated fatty acids prior to their entry into the main
beta-oxidation pathway. The enzyme may also remove odd-numbered double bonds from unsaturated
fatty acids (Koivuranta, K.T. et al. (1994) Biochem. J. 304:787-792; Smeland, T.E. et al. (1992) Proc.
Natl. Acad. Sci. USA 89:6673-6677). 2,4-dienoyl-CoA reductase is located in both mitochondria and
peroxisomes. Inherited deficiencies in mitochondrial and peroxisomal beta-oxidation enzymes are
associated with severe diseases, some of which manifest themselves soon after birth and lead to death
within a few years. Defects in beta-oxidation are associated with Reye's syndrome, Zellweger
syndrome, neonatal adrenoleukodystrophy, infantile Refsum's disease, acyl-CoA oxidase deficiency,
and bifunctional protein deficiency (Suzuki, Y. et al. (1994) Am. J. Hum. Genet. 54:36-43; Hoefler,
supra; Cotran, R.S. et al. (1994) Robbins Pathologic Basis of Disease, W.B. Saunders Co.,
Philadelphia PA, p. 866). Peroxisomal beta-oxidation is impaired in cancerous tissue. Although
neoplastic human breast epithelial cells have the same number of peroxisomes as do normal cells,
fatty acyl-CoA oxidase activity is lower than in control tissue (el Bouhtoury, F. et al. (1992) J. Pathol.
166:27-35). Human colon carcinomas have fewer peroxisomes than normal colon tissue and have
lower fatty-acyl-CoA oxidase and bifunctional enzyme (including enoyl-CoA hydratase) activities
than normal tissue (Cable, S. et al. (1992) Virchows Arch. B Cell Pathol. Incl. Mol. Pathol.
62:221-226). Another important oxidoreductase is isocitrate dehydrogenase, which catalyzes the conversion
of isocitrate to α-ketoglutarate, a substrate of the citric acid cycle. Isocitrate dehydrogenase can be
either NAD or NADP dependent, and is found in the cytosol, mitochondria, and peroxisomes.
Activity of isocitrate dehydrogenase is regulated developmentally, and by hormones,
neurotransmitters, and growth factors.

Hydroxypyruvate reductase (HPR), a peroxisomal 2-hydroxyacid dehydrogenase in the
glycolate pathway, catalyzes the conversion of hydroxypyruvate to glycerate with the oxidation of
both NADH and NADPH. The reverse dehydrogenase reaction reduces NAD+ and NADP+. HPR
recycles nucleotides and bases back into pathways leading to the synthesis of ATP and GTP. ATP
and GTP are used to produce DNA and RNA and to control various aspects of signal transduction and
energy metabolism. Inhibitors of purine nucleotide biosynthesis have long been employed as

to particular sites on a type of substrate. Transferases participate in reactions essential to such
functions as synthesis and degradation of cell components, regulation of cell functions including cell
signaling, cell proliferation, inflammation, apoptosis, secretion and excretion. Transferases are
involved in key steps in disease processes involving these functions. Transferases are frequently
classified according to the type of group transferred. For example, methyl transferases transfer
one-carbon methyl groups, amino transferases transfer nitrogenous amino groups, and similarly
denominated enzymes transfer aldehyde or ketone, acyl, glycosyl, alkyl or aryl, isoprenyl, saccharyl,
phosphorus-containing, sulfur-containing, or selenium-containing groups, as well as small
enzymatic groups such as Coenzyme A.

Acyl transferases include peroxisomal carnitine octanoyl transferase, which is involved in the
fatty acid beta-oxidation pathway, and mitochondrial carnitine palmitoyl transferases, involved in
fatty acid metabolism and transport. Choline O-acetyl transferase catalyzes the biosynthesis of the
neurotransmitter acetylcholine.

Amino transferases play key roles in protein synthesis and degradation, and they contribute to
other processes as well. For example, the amino transferase 5-aminolevulinic acid synthase catalyzes
the addition of succinyl-CoA to glycine, the first step in heme biosynthesis. Other amino transferases
participate in pathways important for neurological function and metabolism. For example,
glutamine-phenylpyruvate amino transferase, also known as glutamine transaminase K (GTK),
catalyzes several reactions with a pyridoxal phosphate cofactor. GTK catalyzes the reversible
conversion of L-glutamine and phenylpyruvate to 2-oxoglutaramate and L-phenylalanine. Other
amino acid substrates for GTK include L-methionine, L-histidine, and L-tyrosine. GTK also
catalyzes the conversion of kynurenine to kynurenic acid, a tryptophan metabolite that is an
antagonist of the N-methyl-D-aspartate (NMDA) receptor in the brain and may exert a
neuromodulatory function. Alteration of the kynurenine metabolic pathway may be associated with
several neurological disorders. GTK also plays a role in the metabolism of halogenated xenobiotics
conjugated to glutathione, leading to nephrotoxicity in rats and neurotoxicity in humans. GTK is
expressed in kidney, liver, and brain. Both human and rat GTKs contain a putative pyridoxal
phosphate binding site (ExPASy ENZYME: EC 2.6.1.64; Perry, S.J. et al. (1993) Mol. Pharmacol.
43:660-665; Perry, S. et al. (1995) FEBS Lett. 360:277-280; and Alberati-Giani, D. et al. (1995) J.
Neurochem. 64:1448-1455). A second amino transferase associated with this pathway is
kynurenine/α-aminoadipate amino transferase (AadAT). AadAT catalyzes the reversible conversion
of α-aminoadipate and α-ketoglutarate to α-ketoadipate and L-glutamate during lysine metabolism.
AadAT also catalyzes the transamination of kynurenine to kynurenic acid. A cytosolic AadAT is
expressed in rat kidney, liver, and brain (Nakatani, Y. et al. (1970) Biochim. Biophys. Acta
198:219-228; Buchli, R. et al. (1995) J. Biol. Chem. 270:29330-29335).

WO 2004/023973 PCT/US2003/028227

Glycosyl transferases include the mammalian UDP-glucuronosyl transferases, a family of
membrane-bound microsomal enzymes catalyzing the transfer of glucuronic acid to lipophilic
substrates in reactions that play important roles in detoxification and excretion of drugs, carcinogens,

UDP-galactose-ceramide galactosyl transferase, catalyzes the transfer of galactose to ceramide in the
synthesis of galactocerebrosides in myelin membranes of the nervous system. The UDP-glycosyl
transferases share a conserved signature domain of about 50 amino acid residues (PROSITE:
PDOC00359).

interferon, suggesting an important role for methylation in cytokine receptor signaling (Lin, W.J. et
al. (1996) J. Biol. Chem. 271:15034-15044; Abramovich, C. et al. (1997) EMBO J. 16:260-266; and
Scott, H.S. et al. (1998) Genomics 48:330-340).

Saccharyl transferases are glycating enzymes involved in a variety of metabolic processes.

Accumulation of these endproducts is observed in vascular complications

disease, renal insufficiency, and Alzheimer's disease (Thornalley, P.J. (1998) Cell Mol. Biol.
(Noisy-Le-Grand) 44:1013-1023).

Coenzyme A (CoA) transferase catalyzes the transfer of CoA between two carboxylic acids.
Succinyl CoA:3-oxoacid CoA transferase, for example, transfers CoA from succinyl-CoA to a
recipient such as acetoacetate. Acetoacetate is essential to the metabolism of ketone bodies, which
accumulate in tissues affected by metabolic disorders such as diabetes (PROSITE: PDOC00980).

Hydrolases 

Hydrolysis is the breaking of a covalent bond in a substrate by introduction of a molecule of
water. The reaction involves a nucleophilic attack by the water molecule's oxygen atom on a target
bond in the substrate. The water molecule is split across the target bond, breaking the bond and
generating two product molecules. Hydrolases participate in reactions essential to such functions as
synthesis and degradation of cell components, and for regulation of cell functions including cell
signaling, cell proliferation, inflammation, apoptosis, secretion and excretion. Hydrolases are involved
in key steps in disease processes involving these functions. Hydrolytic enzymes, or hydrolases, may
be grouped by substrate specificity into classes including phosphatases, peptidases,
lysophospholipases, phosphodiesterases, glycosidases, and glyoxalases.

Phosphatases hydrolytically remove phosphate groups from proteins, an energy-providing
step that regulates many cellular processes, including intracellular signaling pathways that in turn
control cell growth and differentiation, cell-cell contact, the cell cycle, and oncogenesis.

Lysophospholipases (LPLs) regulate intracellular lipids by catalyzing the hydrolysis of ester
bonds to remove an acyl group, a key step in lipid degradation. Small LPL isoforms, approximately
15-30 kD, function as hydrolases; larger isoforms function both as hydrolases and transacylases. A
particular substrate for LPLs, lysophosphatidylcholine, causes lysis of cell membranes. LPL activity
is regulated by signaling molecules important in numerous pathways, including the inflammatory
response.

Peptidases, also called proteases, cleave peptide bonds that form the backbone of peptide or
protein chains. Proteolytic processing is essential to cell growth, differentiation, remodeling, and
homeostasis as well as inflammation and immune response. Since typical protein half-lives range
from hours to a few days, peptidases are continually cleaving precursor proteins to their active form,
removing signal sequences from targeted proteins, and degrading aged or defective proteins.
Peptidases function in bacterial, parasitic, and viral invasion and replication within a host. Examples
of peptidases include trypsin and chymotrypsin (components of the complement cascade and the
blood-clotting cascade), lysosomal cathepsins, calpains, pepsin, renin, and chymosin (Beynon, R.J.
and J.S. Bond (1994) Proteolytic Enzymes: A Practical Approach, Oxford University Press, New
York NY, pp. 1-5). Proteolytic enzymes or proteases either activate or deactivate proteins by
hydrolyzing peptide bonds. Proteases are found in the cytosol, in membrane-bound compartments,

amine-lyases (deaminases) and other lyases.

Proper regulation of lyases is critical to normal physiology. For example, mutation-induced
deficiencies in the uroporphyrinogen decarboxylase can lead to photosensitive cutaneous lesions in
the genetically-linked disorder familial porphyria cutanea tarda (Mendez, M. et al. (1998) Am. J.
Hum. Genet. 63:1363-1375). It has also been shown that adenosine deaminase (ADA) deficiency stems
from genetic mutations in the ADA gene, resulting in the disorder severe combined
immunodeficiency disease (SCID) (Hershfield, M.S. (1998) Semin. Hematol. 35:291-298).

Isomerases 

Isomerases are a class of enzymes that catalyze geometric or structural changes within a
molecule to form a single product. This class includes racemases and epimerases,
cis-trans-isomerases, intramolecular oxidoreductases, intramolecular transferases (mutases) and intramolecular
lyases. Isomerases are critical components of cellular biochemistry with roles in metabolic energy
production including glycolysis, as well as other diverse enzymatic processes (Stryer, L. (1995)
Biochemistry, W.H. Freeman and Co., New York NY, pp. 483-507).

Racemases are a subset of isomerases that catalyze inversion of a molecule's configuration
around the asymmetric carbon atom in a substrate having a single center of asymmetry, thereby
interconverting two racemers. Epimerases are another subset of isomerases that catalyze inversion of
configuration around an asymmetric carbon atom in a substrate with more than one center of
asymmetry, thereby interconverting two epimers. Racemases and epimerases can act on amino acids
and derivatives, hydroxy acids and derivatives, as well as carbohydrates and derivatives. The
interconversion of UDP-galactose and UDP-glucose is catalyzed by UDP-galactose-4'-epimerase.
Proper regulation and function of this epimerase is essential to the synthesis of glycoproteins and
glycolipids. Elevated blood galactose levels have been correlated with UDP-galactose-4'-epimerase
deficiency in screening programs of infants (Gitzelmann, R. (1972) Helv. Paediat. Acta 27:125-130).

Oxidoreductases can be isomerases as well. Oxidoreductases catalyze the reversible transfer
of electrons from a substrate that becomes oxidized to a substrate that becomes reduced. This class
of enzymes includes dehydrogenases, hydroxylases, oxidases, oxygenases, peroxidases, and
reductases. Proper maintenance of oxidoreductase levels is physiologically important. For example,
genetically-linked deficiencies in lipoamide dehydrogenase can result in lactic acidosis (Robinson,
B.H. et al. (1977) Pediatr. Res. 11:1198-1202).

Another subgroup of isomerases are the transferases (or mutases). Transferases transfer a
chemical group from one compound (the donor) to another compound (the acceptor). The types of
groups transferred by these enzymes include acyl groups, amino groups, phosphate groups
(phosphotransferases or phosphomutases), and others. The transferase carnitine palmitoyltransferase
is an important component of fatty acid metabolism. Genetically-linked deficiencies in this
transferase can lead to myopathy (Scriver, C.R. et al. (1995) The Metabolic and Molecular Basis of

succinyl, and propionyl moieties, collectively referred to as acyl groups. Other carbon substrates
include enoyl lipid, which acts as a fatty acid oxidation intermediate, and carnitine, which acts as an
acetyl-CoA flux regulator/mitochondrial acyl group transfer protein. Acyl-CoA and acetyl-CoA are
synthesized in the cell by acyl-CoA synthetase and acetyl-CoA synthetase, respectively.

Activation of fatty acids is mediated by at least three forms of acyl-CoA synthetase activity:
i) acetyl-CoA synthetase, which activates acetate and several other low molecular weight carboxylic
acids and is found in muscle mitochondria and the cytosol of other tissues; ii) medium-chain
acyl-CoA synthetase, which activates fatty acids containing between four and eleven carbon atoms
(predominantly from dietary sources), and is present only in liver mitochondria; and iii) a third form,
which is specific for long chain fatty acids with between six and twenty carbon atoms, and is found in
microsomes and the mitochondria. Proteins associated with acyl-CoA synthetase activity have been
identified from many sources including bacteria, yeast, plants, mouse, and man. The activity of
acyl-CoA synthetase may be modulated by phosphorylation of the enzyme by cAMP-dependent
protein kinase.

Ligases forming carbon-nitrogen bonds include amide synthases such as glutamine
synthetase (glutamate-ammonia ligase) that catalyzes the amination of glutamic acid to glutamine by
ammonia using the energy of ATP hydrolysis. Glutamine is the primary source for the amino group
in various amide transfer reactions involved in de novo pyrimidine nucleotide synthesis and in purine
and pyrimidine ribonucleotide interconversions. Overexpression of glutamine synthetase has been
observed in primary liver cancer (Christa, L. et al. (1994) Gastroent. 106:1312-1320).

Acid-amino-acid ligases (peptide synthases) are represented by the ubiquitin proteases which
are associated with the ubiquitin conjugation system (UCS), a major pathway for the degradation of
cellular proteins in eukaryotic cells and some bacteria. The UCS mediates the elimination of
abnormal proteins and regulates the half-lives of important regulatory proteins that control cellular
processes such as gene transcription and cell cycle progression. In the UCS pathway, proteins
targeted for degradation are conjugated to a ubiquitin (Ub), a small heat stable protein. Ub is first
activated by a ubiquitin-activating enzyme (E1), and then transferred to one of several
Ub-conjugating enzymes (E2). E2 then links the Ub molecule through its C-terminal glycine to an
internal lysine (acceptor lysine) of a target protein. The ubiquitinated protein is then recognized and
degraded by the proteasome, a large, multisubunit proteolytic enzyme complex, and ubiquitin is released
for reutilization by ubiquitin protease. The UCS is implicated in the degradation of mitotic cyclic
kinases, oncoproteins, tumor suppressor genes such as p53, viral proteins, cell surface receptors
associated with signal transduction, transcriptional regulators, and mutated or damaged proteins
(Ciechanover, A. (1994) Cell 79:13-21). A murine proto-oncogene, Unp, encodes a nuclear ubiquitin
protease whose overexpression leads to oncogenic transformation of NIH3T3 cells, and the human
homolog of this gene is consistently elevated in small cell tumors and adenocarcinomas of the lung

of pyruvate to oxaloacetate, a key intermediate in the citric acid cycle.

Ligases forming phosphoric ester bonds include the DNA ligases involved in both DNA
replication and repair. DNA ligases seal phosphodiester bonds between two adjacent nucleotides in a
DNA chain using the energy from ATP hydrolysis to first activate the free 5'-phosphate of one
nucleotide and then react it with the 3'-OH group of the adjacent nucleotide. This resealing reaction
is used both in DNA replication, to join small DNA fragments called Okazaki fragments that are
transiently formed in the process of replicating new DNA, and in DNA repair. DNA repair is the
process by which accidental base changes, such as those produced by oxidative damage, hydrolytic
attack, or uncontrolled methylation of DNA, are corrected before replication or transcription of the
DNA can occur. Bloom's syndrome is an inherited human disease in which individuals are partially
deficient in DNA ligation and consequently have an increased incidence of cancer (Alberts, B. et al.
(1994) The Molecular Biology of the Cell, Garland Publishing Inc., New York NY, p. 247).
Molecules Associated with Growth and Development 

Human growth and development requires the spatial and temporal regulation of cell
differentiation, cell proliferation, and apoptosis. These processes coordinately control reproduction,
aging, embryogenesis, morphogenesis, organogenesis, and tissue repair and maintenance. At the
cellular level, growth and development is governed by the cell's decision to enter into or exit from
the cell division cycle and by the cell's commitment to a terminally differentiated state. These
decisions are made by the cell in response to extracellular signals and other environmental cues it
receives. The following discussion focuses on the molecular mechanisms of cell division,
reproduction, cell differentiation and proliferation, apoptosis, and aging.
Cell Division 

Cell division is the fundamental process by which all living things grow and reproduce. In
unicellular organisms such as yeast and bacteria, each cell division doubles the number of organisms,
while in multicellular species many rounds of cell division are required to replace cells lost by wear
or by programmed cell death, and for cell differentiation to produce a new tissue or organ. Details of
the cell division cycle may vary, but the basic process consists of three principal events. The first
event, interphase, involves preparations for cell division, replication of the DNA, and production of
essential proteins. In the second event, mitosis, the nuclear material is divided and separates to
opposite sides of the cell. The final event, cytokinesis, is division and fission of the cell cytoplasm.
The sequence and timing of cell cycle transitions is under the control of the cell cycle regulation
system which controls the process by positive or negative regulatory circuits at various check points.

Regulated progression of the cell cycle depends on the integration of growth control
pathways with the basic cell cycle machinery. Cell cycle regulators have been identified by selecting
for human and yeast cDNAs that block or activate cell cycle arrest signals in the yeast mating
pheromone pathway when they are overexpressed. Known regulators include human CPR (cell cycle

other tissues coordinate reproduction and lactation. These signals vary during the monthly
menstruation cycle and during the female's lifetime. Similarly, the sensitivity of reproductive organs
to these endocrine signals varies during the female's lifetime.

A combination of positive and negative feedback to the ovaries, pituitary and hypothalamus
glands controls physiologic changes during the monthly ovulation and endometrial cycles. The
anterior pituitary secretes two major gonadotropin hormones, follicle-stimulating hormone (FSH) and
luteinizing hormone (LH), regulated by negative feedback of steroids, most notably by ovarian
estradiol. If fertilization does not occur, estrogen and progesterone levels decrease. This sudden
reduction of the ovarian hormones leads to menstruation, the desquamation of the endometrium.

Hormones further govern all the steps of pregnancy, parturition, lactation, and menopause.
During pregnancy large quantities of human chorionic gonadotropin (hCG), estrogens, progesterone,
and human chorionic somatomammotropin (hCS) are formed by the placenta. hCG, a glycoprotein
similar to luteinizing hormone, stimulates the corpus luteum to continue producing more progesterone
and estrogens, rather than to involute as occurs if the ovum is not fertilized. hCS is similar to growth
hormone and is crucial for fetal nutrition.

The female breast also matures during pregnancy. Large amounts of estrogen secreted by the 
placenta trigger growth and branching of the breast milk ductal system while lactation is initiated by 
the secretion of prolactin by the pituitary gland. 

Parturition involves several hormonal changes that increase uterine contractility toward the
end of pregnancy, as follows. The levels of estrogens increase more than those of progesterone.
Oxytocin is secreted by the neurohypophysis. Concomitantly, uterine sensitivity to oxytocin
increases. The fetus itself secretes oxytocin, cortisol (from adrenal glands), and prostaglandins.

Menopause occurs when most of the ovarian follicles have degenerated. The ovary then
produces less estradiol, reducing the negative feedback on the pituitary and hypothalamus glands.
Mean levels of circulating FSH and LH increase, even as ovulatory cycles continue. Therefore, the
ovary is less responsive to gonadotropins, and there is an increase in the time between menstrual
cycles. Consequently, menstrual bleeding ceases and reproductive capability ends.
Cell Differentiation and Proliferation

Tissue growth involves complex and ordered patterns of cell proliferation, cell
differentiation, and apoptosis. Cell proliferation must be regulated to maintain both the number of
cells and their spatial organization. This regulation depends upon the appropriate expression of
proteins which control cell cycle progression in response to extracellular signals, such as growth
factors and other mitogens, and intracellular cues, such as DNA damage or nutrient starvation.
Molecules which directly or indirectly modulate cell cycle progression fall into several categories,
including growth factors and their receptors, second messenger and signal transduction proteins,
oncogene products, tumor-suppressor proteins, and mitosis-promoting factors.

can be stimulated by growth factors. For example, TGF-β stimulates fibroblasts to produce a variety
of ECM proteins, including fibronectin, collagen, and tenascin (Pearson, C.A. et al. (1988) EMBO J.
7:2677-2981). In fact, for some cell types specific ECM molecules, such as laminin or fibronectin,
may act as growth factors. Tenascin-C and -R, expressed in developing and lesioned neural tissue,
provide stimulatory/anti-adhesive or inhibitory properties, respectively, for axonal growth (Faissner,
A. (1997) Cell Tissue Res. 290:331-341).

Oncoproteins 

Cancer represents a type of cell proliferative disorder that affects nearly every tissue in the
body. A wide variety of molecules, either aberrantly expressed or mutated, can be the cause of, or
involved with, various cancers because tissue growth involves complex and ordered patterns of cell
proliferation, cell differentiation, and apoptosis. Cell proliferation must be regulated to maintain both
the number of cells and their spatial organization. This regulation depends upon the appropriate
expression of proteins which control cell cycle progression in response to extracellular signals such as
growth factors and other mitogens, and intracellular cues such as DNA damage or nutrient starvation.
Aberrant expression or mutations in any of these gene products can result in cell proliferative
disorders such as cancer. Oncogenes are genes generally derived from normal genes that, through
abnormal expression or mutation, can effect the transformation of a normal cell to a malignant one
(oncogenesis).

Oncoproteins, encoded by oncogenes, can affect cell proliferation in a variety of ways and
include growth factors, growth factor receptors, intracellular signal transducers, nuclear transcription
factors, and cell-cycle control proteins. Molecules which directly or indirectly modulate cell cycle
progression fall into several categories, including growth factors and their receptors, second
messenger and signal transduction proteins, oncogene products, tumor-suppressor proteins, and
mitosis-promoting factors. In contrast, tumor-suppressor genes are involved in inhibiting cell
proliferation. Mutations which cause reduced function or loss of function in tumor-suppressor genes
result in aberrant cell proliferation and cancer. Although many different genes and their products
have been found to be associated with cell proliferative disorders such as cancer, many more may
exist that are yet to be discovered.

Some oncoproteins are mutant isoforms of the normal protein, and other oncoproteins are
abnormally expressed with respect to location or amount of expression. Many oncogenes have been
identified and characterized. These include sis, erbA, erbB, her-2, mutated Gs, src, abl, ras, crk, jun,
fos, myc, and mutated tumor-suppressor genes such as RB, p53, mdm2, Cip1, p16, and cyclin D.
Transformation of normal genes to oncogenes may also occur by chromosomal translocation. The
Philadelphia chromosome, characteristic of chronic myeloid leukemia and a subset of acute
lymphoblastic leukemias, results from a reciprocal translocation between chromosomes 9 and 22 that
moves a truncated portion of the proto-oncogene c-abl to the breakpoint cluster region (bcr) on

further molecular changes occur in the mitochondria of aging cells through deterioration of structure.
These changes eventually contribute to decreased function in every organ of the body.

Biochemical Pathway Molecules 

Biochemical pathways are responsible for regulating metabolism, growth and development,
protein secretion and trafficking, environmental responses, and ecological interactions including
immune response and response to parasites.

DNA Replication

Deoxyribonucleic acid (DNA), the genetic material, is found in both the nucleus and
mitochondria of human cells. The bulk of human DNA is nuclear, in the form of linear
chromosomes, while mitochondrial DNA is circular. DNA replication begins at specific sites called
origins of replication. Bidirectional synthesis occurs from the origin via two growing forks that move
in opposite directions. Replication is semi-conservative, with each daughter duplex containing one
old strand and its newly synthesized complementary partner. Proteins involved in DNA replication
include DNA polymerases, DNA primase, telomerase, DNA helicase, topoisomerases, DNA ligases,
replication factors, and DNA-binding proteins.
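
The semi-conservative rule described above can be illustrated with a minimal sketch. The strand labels and the function names `replicate` and `fraction_with_original` are hypothetical and chosen only for illustration; the sketch assumes an idealized cell in which every duplex is copied exactly once per round.

```python
def replicate(duplexes):
    """One round of semi-conservative replication.

    Each duplex is a pair of strand labels. Every existing strand is kept
    as a template and paired with a newly synthesized strand, labeled "new".
    """
    daughters = []
    for strand_a, strand_b in duplexes:
        daughters.append((strand_a, "new"))
        daughters.append((strand_b, "new"))
    return daughters


def fraction_with_original(rounds):
    """Fraction of duplexes that still carry one of the two original strands."""
    duplexes = [("original", "original")]
    for _ in range(rounds):
        duplexes = replicate(duplexes)
    carrying = sum(1 for duplex in duplexes if "original" in duplex)
    return carrying / len(duplexes)


for n in range(1, 4):
    print(n, fraction_with_original(n))
```

Under these assumptions the two original strands are never destroyed, so exactly two duplexes carry an original strand after any number of rounds, and their share of the population halves with each round after the first.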
DNA Recombination and Repair 

Cells are constantly faced with replication errors and environmental assault (such as
ultraviolet irradiation) that can produce DNA damage. Damage to DNA consists of any change that
modifies the structure of the molecule. Changes to DNA can be divided into two general classes,
single base changes and structural distortions. Any damage to DNA can produce a mutation, and the
mutation may produce a disorder, such as cancer.

Changes in DNA are recognized by repair systems within the cell. These repair systems act
to correct the damage and thus prevent any deleterious effects of a mutational event. Repair systems
can be divided into three general types: direct repair, excision repair, and retrieval systems. Proteins
involved in DNA repair include DNA polymerase, excision repair proteins, excision and cross link
repair proteins, recombination and repair proteins, RAD51 proteins, and BLM and WRN proteins that
are homologs of RecQ helicase. When the repair systems are eliminated, cells become exceedingly
sensitive to environmental mutagens, such as ultraviolet irradiation. Patients with disorders
associated with a loss in DNA repair systems often exhibit a high sensitivity to environmental
mutagens. Examples of such disorders include xeroderma pigmentosum (XP), Bloom's syndrome
(BS), Werner's syndrome (WS) (Yamagata, K. et al. (1998) Proc. Natl. Acad. Sci. USA 95:8733-
8738), ataxia telangiectasia, Cockayne's syndrome, and Fanconi's anemia.

Recombination is the process whereby new DNA sequences are generated by the movements
of large pieces of DNA. In homologous recombination, which occurs during meiosis and DNA
repair, parent DNA duplexes align at regions of sequence similarity, and new DNA molecules form
by the breakage and joining of homologous segments. Proteins involved include RAD51

(1998) Clin. Exp. Rheumatol. 16:317-326). Some examples of hnRNPs include the yeast proteins
aplp, involved in cleavage and polyadenylation at the 3' end of the RNA; Cbp80p, involved in
capping the 5' end of the RNA; and Npl3p, a homolog of mammalian hnRNP A1, involved in export
of mRNA from the nucleus (Shen, E.C. et al. (1998) Genes Dev. 12:679-691). HnRNPs have been
shown to be important targets of the autoimmune response in rheumatic diseases (Biamonti, supra).

Many snRNP proteins, hnRNP proteins, and alternative splicing factors are characterized by
an RNA recognition motif (RRM). (Reviewed in Birney, E. et al. (1993) Nucleic Acids Res.
21:5803-5816.) The RRM is about 80 amino acids in length and forms four β-strands and two
α-helices arranged in an α/β sandwich. The RRM contains a core RNP-1 octapeptide motif along with
surrounding conserved sequences.
RNA Stability and Degradation 

RNA helicases alter and regulate RNA conformation and secondary structure by using energy
derived from ATP hydrolysis to destabilize and unwind RNA duplexes. The most well-characterized
and ubiquitous family of RNA helicases is the DEAD-box family, so named for the conserved B-type
ATP-binding motif which is diagnostic of proteins in this family. Over 40 DEAD-box helicases have
been identified in organisms as diverse as bacteria, insects, yeast, amphibians, mammals, and plants.
DEAD-box helicases function in diverse processes such as translation initiation, splicing, ribosome
assembly, and RNA editing, transport, and stability. Some DEAD-box helicases play tissue- and
stage-specific roles in spermatogenesis and embryogenesis. (Reviewed in Linder, P. et al. (1989)
Nature 337:121-122.)

Overexpression of the DEAD-box 1 protein (DDX1) may play a role in the progression of
neuroblastoma (Nb) and retinoblastoma (Rb) tumors. Other DEAD-box helicases have been
implicated either directly or indirectly in ultraviolet light-induced tumors, B cell lymphoma, and
myeloid malignancies. (Reviewed in Godbout, R. et al. (1998) J. Biol. Chem. 273:21161-21168.)

Ribonucleases (RNases) catalyze the hydrolysis of phosphodiester bonds in RNA chains, thus
cleaving the RNA. For example, RNase P is a ribonucleoprotein enzyme which cleaves the 5' end of
pre-tRNAs as part of their maturation process. RNase H digests the RNA strand of an RNA/DNA
hybrid. Such hybrids occur in cells invaded by retroviruses, and RNase H is an important enzyme in
the retroviral replication cycle. RNase H domains are often found as a domain associated with
reverse transcriptases. RNase activity in serum and cell extracts is elevated in a variety of cancers
and infectious diseases (Schein, C.H. (1997) Nat. Biotechnol. 15:529-536). Regulation of RNase
activity is being investigated as a means to control tumor angiogenesis, allergic reactions, viral
infection and replication, and fungal infections.
Protein Translation 

The eukaryotic ribosome is composed of a 60S (large) subunit and a 40S (small) subunit,
which together form the 80S ribosome. In addition to the 18S, 28S, 5S, and 5.8S rRNAs, the

cap, eIF4A is a bidirectional RNA-dependent helicase, and eIF4G is a scaffolding polypeptide.
eIF4G has three binding domains. The N-terminal third of eIF4G interacts with eIF4E, the central
third interacts with eIF4A, and the C-terminal third interacts with eIF3 bound to the 43S preinitiation
complex. Thus, eIF4G acts as a bridge between the 40S ribosomal subunit and the mRNA (Hentze,
M.W. (1997) Science 275:500-501).

The ability of eIF4F to initiate binding of the 43S preinitiation complex is regulated by
structural features of the mRNA. The mRNA molecule has an untranslated region (UTR) between
the 5' cap and the AUG start codon. In some mRNAs this region forms secondary structures that
impede binding of the 43S preinitiation complex. The helicase activity of eIF4A is thought to
function in removing this secondary structure to facilitate binding of the 43S preinitiation complex
(Pain, supra).
Translation Elongation 

Elongation is the process whereby additional amino acids are joined to the initiator
methionine to form the complete polypeptide chain. The elongation factors EF1α, EF1βγ, and EF2
are involved in elongating the polypeptide chain following initiation. EF1α is a GTP-binding protein.
In EF1α's GTP-bound form, it brings an aminoacyl-tRNA to the ribosome's A site. The amino acid
attached to the newly arrived aminoacyl-tRNA forms a peptide bond with the initiator methionine.
The GTP on EF1α is hydrolyzed to GDP, and EF1α-GDP dissociates from the ribosome. EF1βγ
binds EF1α-GDP and induces the dissociation of GDP from EF1α, allowing EF1α to bind GTP and a
new cycle to begin.

As subsequent aminoacyl-tRNAs are brought to the ribosome, EF-G, another GTP-binding
protein, catalyzes the translocation of tRNAs from the A site to the P site and finally to the E site of
the ribosome. This allows the processivity of translation.
Translation Termination 

The release factor eRF carries out termination of translation. eRF recognizes stop codons in
the mRNA, leading to the release of the polypeptide chain from the ribosome.
Post-Translational Pathways 

Proteins may be modified after translation by the addition of phosphate, sugar, prenyl, fatty
acid, and other chemical groups. These modifications are often required for proper protein activity.
Enzymes involved in post-translational modification include kinases, phosphatases,
glycosyltransferases, and prenyltransferases. The conformation of proteins may also be modified
after translation by the introduction and rearrangement of disulfide bonds (rearrangement catalyzed
by protein disulfide isomerase), the isomerization of proline sidechains by prolyl isomerase, and by
interactions with molecular chaperone proteins.

Proteins may also be cleaved by proteases. Such cleavage may result in activation,
inactivation, or complete degradation of the protein. Proteases include serine proteases, cysteine

Fatty acids are long-chain organic acids with a single carboxyl group and a long non-polar
hydrocarbon tail. Long-chain fatty acids are essential components of glycolipids, phospholipids, and
cholesterol, which are building blocks for biological membranes, and of triglycerides, which are
biological fuel molecules. Long-chain fatty acids are also substrates for eicosanoid production, and
are important in the functional modification of certain complex carbohydrates and proteins.
16-carbon and 18-carbon fatty acids are the most common.

Fatty acid synthesis occurs in the cytoplasm. In the first step, acetyl-Coenzyme A (CoA)
carboxylase (ACC) synthesizes malonyl-CoA from acetyl-CoA and bicarbonate. The enzymes which
catalyze the remaining reactions are covalently linked into a single polypeptide chain, referred to as
the multifunctional enzyme fatty acid synthase (FAS). FAS catalyzes the synthesis of palmitate from
acetyl-CoA and malonyl-CoA. FAS contains acetyl transferase, malonyl transferase, β-ketoacyl
synthase, acyl carrier protein, β-ketoacyl reductase, dehydratase, enoyl reductase, and thioesterase
activities. The final product of the FAS reaction is the 16-carbon fatty acid palmitate. Further
elongation, as well as unsaturation, of palmitate by accessory enzymes of the ER produces the variety
of long chain fatty acids required by the individual cell. These enzymes include a NADH-cytochrome
b5 reductase, cytochrome b5, and a desaturase.
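
The input requirements of the FAS reaction can be tallied with a short sketch. It assumes the standard textbook stoichiometry, which is not stated in the text above: one acetyl-CoA primer, one malonyl-CoA per two-carbon extension, and two NADPH per extension (one for each of the two reductase activities). The function name `fas_requirements` is hypothetical.

```python
def fas_requirements(carbons=16):
    """Inputs needed to build a saturated, even-chain fatty acid on FAS.

    Assumed stoichiometry: one acetyl-CoA primer, then one malonyl-CoA and
    two NADPH for every two-carbon extension of the growing chain.
    """
    if carbons < 4 or carbons % 2:
        raise ValueError("expected an even chain length of at least 4")
    extensions = carbons // 2 - 1
    return {
        "acetyl_CoA": 1,
        "malonyl_CoA": extensions,
        "NADPH": 2 * extensions,
    }


# Palmitate, the 16-carbon product named in the text.
print(fas_requirements(16))
```

For palmitate this gives one acetyl-CoA, seven malonyl-CoA, and fourteen NADPH. Chain lengths other than 16 are accepted only to show how the tally scales; the text identifies palmitate as the product of FAS itself.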
Phospholipid and Triacylglycerol Synthesis

Triacylglycerols, also known as triglycerides and neutral fats, are major energy stores in
animals. Triacylglycerols are esters of glycerol with three fatty acid chains. Glycerol-3-phosphate is
produced from dihydroxyacetone phosphate by the enzyme glycerol phosphate dehydrogenase or
from glycerol by glycerol kinase. Fatty acyl-CoAs are produced from fatty acids by fatty acyl-CoA
synthetases. Glycerol-3-phosphate is acylated with two fatty acyl-CoAs by the enzyme glycerol
phosphate acyltransferase to give phosphatidate. Phosphatidate phosphatase converts phosphatidate
to diacylglycerol, which is subsequently acylated to a triacylglycerol by the enzyme diglyceride
acyltransferase. Phosphatidate phosphatase and diglyceride acyltransferase form a triacylglycerol
synthetase complex bound to the ER membrane.

A major class of phospholipids is the phosphoglycerides, which are composed of a glycerol
backbone, two fatty acid chains, and a phosphorylated alcohol. Phosphoglycerides are components of
cell membranes. Principal phosphoglycerides are phosphatidyl choline, phosphatidyl ethanolamine,
phosphatidyl serine, phosphatidyl inositol, and diphosphatidyl glycerol. Many enzymes involved in
phosphoglyceride synthesis are associated with membranes (Meyers, R.A. (1995) Molecular Biology
and Biotechnology, VCH Publishers Inc., New York NY, pp. 494-501). Phosphatidate is converted to
CDP-diacylglycerol by the enzyme phosphatidate cytidylyltransferase (ExPASy ENZYME EC
2.7.7.41). Transfer of the diacylglycerol group from CDP-diacylglycerol to serine to yield
phosphatidyl serine, or to inositol to yield phosphatidyl inositol, is catalyzed by the enzymes CDP-
diacylglycerol-serine O-phosphatidyltransferase and CDP-diacylglycerol-inositol 3-

Cholesterol is used in the synthesis of steroid hormones such as cortisol, progesterone,
aldosterone, estrogen, and testosterone. First, cholesterol is converted to pregnenolone by

retinal, dolichol phosphate (a carrier of oligosaccharides needed for N-linked glycosylation), and
farnesyl and geranylgeranyl groups that modify proteins. Enzymes involved include farnesyl
transferase, polyprenyl transferases, dolichyl phosphatase, and dolichyl kinase.

Sphingolipid Metabolism

Sphingolipids are an important class of membrane lipids that contain sphingosine, a long
chain amino alcohol. They are composed of one long-chain fatty acid, one polar head alcohol, and
sphingosine or sphingosine derivative. The three classes of sphingolipids are sphingomyelins,
cerebrosides, and gangliosides. Sphingomyelins, which contain phosphocholine or
phosphoethanolamine as their head group, are abundant in the myelin sheath surrounding nerve cells.
Galactocerebrosides, which contain a glucose or galactose head group, are characteristic of the brain.
Other cerebrosides are found in nonneural tissues. Gangliosides, whose head groups contain multiple
sugar units, are abundant in the brain, but are also found in nonneural tissues.

Sphingolipids are built on a sphingosine backbone. Sphingosine is acylated to ceramide by
the enzyme sphingosine acetyltransferase. Ceramide and phosphatidyl choline are converted to
sphingomyelin by the enzyme ceramide choline phosphotransferase. Cerebrosides are synthesized by
the linkage of glucose or galactose to ceramide by a transferase. Sequential addition of sugar
residues to ceramide by transferase enzymes yields gangliosides.
Eicosanoid Metabolism 

Eicosanoids, including prostaglandins, prostacyclin, thromboxanes, and leukotrienes, are
20-carbon molecules derived from fatty acids. Eicosanoids are signaling molecules which have roles in
pain, fever, and inflammation. The precursor of all eicosanoids is arachidonate, which is generated
from phospholipids by phospholipase and from diacylglycerols by diacylglycerol lipase.
Leukotrienes are produced from arachidonate by the action of lipoxygenases. Prostaglandin synthase,
reductases, and isomerases are responsible for the synthesis of the prostaglandins. Prostaglandins
have roles in inflammation, blood flow, ion transport, synaptic transmission, and sleep. Prostacyclin
and the thromboxanes are derived from a precursor prostaglandin by the action of prostacyclin
synthase and thromboxane synthases, respectively.
Ketone Body Metabolism 

Pairs of acetyl-CoA molecules derived from fatty acid oxidation in the liver can condense to
form acetoacetyl-CoA, which subsequently forms acetoacetate, D-3-hydroxybutyrate, and acetone.
These three products are known as ketone bodies. Enzymes involved in ketone body metabolism
include HMG-CoA synthetase, HMG-CoA cleavage enzyme, D-3-hydroxybutyrate dehydrogenase,
acetoacetate decarboxylase, and 3-ketoacyl-CoA transferase. Ketone bodies are a normal fuel supply
of the heart and renal cortex. Acetoacetate produced by the liver is transported to cells where the
acetoacetate is converted back to acetyl-CoA and enters the citric acid cycle. In times of starvation,
ketone bodies produced from stored triacylglycerols become an important fuel source, especially for

esters in the blood are transported in lipoprotein particles. The particles consist of a core of
hydrophobic lipids surrounded by a shell of polar lipids and apolipoproteins. The protein components
serve in the solubilization of hydrophobic lipids and also contain cell-targeting signals. Lipoproteins
include chylomicrons, chylomicron remnants, very-low-density lipoproteins (VLDL), intermediate-
density lipoproteins (IDL), low-density lipoproteins (LDL), and high-density lipoproteins (HDL).
There is a strong inverse correlation between the levels of plasma HDL and risk of premature
coronary heart disease.

Triacylglycerols in chylomicrons and VLDL are hydrolyzed by lipoprotein lipases that line
blood vessels in muscle and other tissues that use fatty acids. Cell surface LDL receptors bind LDL
particles which are then internalized by endocytosis. Absence of the LDL receptor, the cause of the
disease familial hypercholesterolemia, leads to increased plasma cholesterol levels and ultimately to
atherosclerosis. Plasma cholesteryl ester transfer protein mediates the transfer of cholesteryl esters
from HDL to apolipoprotein B-containing lipoproteins. Cholesteryl ester transfer protein is important
in the reverse cholesterol transport system and may play a role in atherosclerosis (Yamashita, S. et al.
(1997) Curr. Opin. Lipidol. 8:101-110). Macrophage scavenger receptors, which bind and internalize
modified lipoproteins, play a role in lipid transport and may contribute to atherosclerosis (Greaves,
D.R. et al. (1998) Curr. Opin. Lipidol. 9:425-432).
Proteins involved in cholesterol uptake and biosynthesis are tightly regulated in response to
cellular cholesterol levels. The sterol regulatory element binding protein (SREBP) is a sterol-
responsive transcription factor. Under normal cholesterol conditions, SREBP resides in the
endoplasmic reticulum membrane. When cholesterol levels are low, a regulated cleavage of SREBP
occurs which releases the extracellular domain of the protein. This cleaved domain is then transported
to the nucleus where it activates the transcription of the LDL receptor gene, and genes encoding
enzymes of cholesterol synthesis, by binding the sterol regulatory element (SRE) upstream of the
genes (Yang, J. et al.
(1995) J. Biol. Chem. 270:12152-12161). Regulation of cholesterol uptake and biosynthesis also
occurs via the oxysterol-binding protein (OSBP). OSBP is a high-affinity intracellular receptor for a
variety of oxysterols that down-regulate cholesterol synthesis and stimulate cholesterol esterification
(Lagace, T.A. et al. (1997) Biochem. J. 326:205-213).

Beta-oxidation

Mitochondrial and peroxisomal beta-oxidation enzymes degrade saturated and unsaturated 
fatty acids by sequential removal of two-carbon units from CoA-activated fatty acids. The main beta- 
oxidation pathway degrades both saturated and unsaturated fatty acids while the auxiliary pathway 
performs additional steps required for the degradation of unsaturated fatty acids. 
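
The sequential removal of two-carbon units can be followed with a minimal sketch. It assumes a saturated, even-chain fatty acyl-CoA handled by the main pathway only, and the function name `beta_oxidation` is hypothetical.

```python
def beta_oxidation(carbons):
    """Count cycles and acetyl-CoA released from a saturated, even-chain acyl-CoA.

    Each cycle removes one two-carbon unit as acetyl-CoA. The loop stops
    when only a two-carbon acyl-CoA remains, which is itself acetyl-CoA.
    """
    if carbons < 4 or carbons % 2:
        raise ValueError("expected an even chain length of at least 4")
    cycles = 0
    acetyl_coa = 0
    while carbons > 2:
        carbons -= 2        # one two-carbon unit leaves the chain
        acetyl_coa += 1
        cycles += 1
    acetyl_coa += 1         # the remaining two-carbon unit
    return cycles, acetyl_coa


# Palmitoyl-CoA (16 carbons) as a worked case.
print(beta_oxidation(16))
```

For a 16-carbon chain the sketch reports seven cycles and eight acetyl-CoA. Unsaturated fatty acids would need the extra steps of the auxiliary pathway, which this sketch does not model.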

The pathways of mitochondrial and peroxisomal beta-oxidation use similar enzymes, but
have different substrate specificities and functions. Mitochondria oxidize short-, medium-, and long-
chain fatty acids to produce energy for cells. Mitochondrial beta-oxidation is a major energy source
for cardiac and skeletal muscle. In liver, it provides ketone bodies to the peripheral circulation when
glucose levels are low as in starvation, endurance exercise, and diabetes (Eaton, S. et al. (1996)
Biochem. J. 320:345-357). Peroxisomes oxidize medium-, long-, and very-long-chain fatty acids,
dicarboxylic fatty acids, branched fatty acids, prostaglandins, xenobiotics, and bile acid
intermediates. The chief roles of peroxisomal beta-oxidation are to shorten toxic lipophilic
carboxylic acids to facilitate their excretion and to shorten very-long-chain fatty acids prior to
mitochondrial beta-oxidation (Mannaerts, G.P. and P.P. van Veldhoven (1993) Biochimie 75:147-
158).

Enzymes involved in beta-oxidation include acyl CoA synthetase, carnitine acyltransferase,
acyl CoA dehydrogenases, enoyl CoA hydratases, L-3-hydroxyacyl CoA dehydrogenase,
β-ketothiolase, 2,4-dienoyl CoA reductase, and isomerase.
Lipid Cleavage and Degradation 

Triglycerides are hydrolyzed to fatty acids and glycerol by lipases. Lysophospholipases
(LPLs) are widely distributed enzymes that metabolize intracellular lipids, and occur in numerous
isoforms. Small isoforms, approximately 15-30 kD, function as hydrolases; large isoforms, those
exceeding 60 kD, function both as hydrolases and transacylases. A particular substrate for LPLs,
lysophosphatidylcholine, causes lysis of cell membranes when it is formed or imported into a cell.
LPLs are regulated by lipid factors including acylcarnitine, arachidonic acid, and phosphatidic acid.
These lipid factors are signaling molecules important in numerous pathways, including the
inflammatory response. (Anderson, R. et al. (1994) Toxicol. Appl. Pharmacol. 125:176-183; Selle, H.
et al. (1993) Eur. J. Biochem. 212:411-416.)

The secretory phospholipase A2 (PLA2) superfamily comprises a number of heterogeneous
enzymes whose common feature is to hydrolyze the sn-2 fatty acid acyl ester bond of
phosphoglycerides. Hydrolysis of the glycerophospholipids releases free fatty acids and
phospholipid fatty acids, such as arachidonic acid, and lysophospholipids.

Carbon and Carbohydrate Metabolism

Carbohydrates, including sugars or saccharides, starch, and cellulose, are aldehyde or ketone
compounds with multiple hydroxyl groups. The importance of carbohydrate metabolism is
demonstrated by the sensitive regulatory system in place for maintenance of blood glucose levels.
Two pancreatic hormones, insulin and glucagon, promote increased glucose uptake by
cells, and increased glucose release from cells, respectively. Carbohydrates have three important
roles in mammalian cells. First, carbohydrates are used as energy stores, fuels, and metabolic
intermediates. Carbohydrates are broken down to form energy in glycolysis and are stored as
glycogen for later use. Second, the sugars deoxyribose and ribose form part of the structural support
of DNA and RNA, respectively. Third, carbohydrate modifications are added to secreted and
membrane proteins and lipids as they traverse the secretory pathway. Cell surface carbohydrate-
containing macromolecules, including glycoproteins, glycolipids, and transmembrane proteoglycans,
mediate adhesion with other cells and with components of the extracellular matrix. The extracellular
matrix is comprised of diverse glycoproteins, glycosaminoglycans (GAGs), and carbohydrate-binding
proteins which are secreted from the cell and assembled into an organized meshwork in close
association with the cell surface. The interaction of the cell with the surrounding matrix profoundly
influences cell shape, strength, flexibility, motility, and adhesion. These dynamic properties are
intimately associated with signal transduction pathways controlling cell proliferation and
differentiation, tissue construction, and embryonic development.

Carbohydrate metabolism is altered in several disorders including diabetes mellitus,
hyperglycemia, hypoglycemia, galactosemia, galactokinase deficiency, and UDP-galactose-4-
epimerase deficiency (Fauci, A.S. et al. (1998) Harrison's Principles of Internal Medicine, McGraw-
Hill, New York NY, pp. 2208-2209). Altered carbohydrate metabolism is associated with cancer.
Reduced GAG and proteoglycan expression is associated with human lung carcinomas (Nackaerts, K.
et al. (1997) Int. J. Cancer 74:335-345). The carbohydrate determinants sialyl Lewis A and sialyl
Lewis X are frequently expressed on human cancer cells (Kannagi, R. (1997) Glycoconj. J. 14:577-
584). Alterations of the N-linked carbohydrate core structure of cell surface glycoproteins are linked
to colon and pancreatic cancers (Schwarz, R.E. et al. (1996) Cancer Lett. 107:285-291). Reduced
expression of the Sda blood group carbohydrate structure in cell surface glycolipids and glycoproteins
is observed in gastrointestinal cancer (Dohi, T. et al. (1996) Int. J. Cancer 67:626-663). (Carbon and
carbohydrate metabolism is reviewed in Stryer, L. (1995) Biochemistry, W.H. Freeman and Company,
New York NY; Lehninger, A.L. (1982) Principles of Biochemistry, Worth Publishers, Inc., New York
NY; and Lodish, H. et al. (1995) Molecular Cell Biology, Scientific American Books, New York NY.)
Glycolysis

Enzymes of the glycolytic pathway convert the sugar glucose to pyruvate while
simultaneously producing ATP. The pathway also provides building blocks for the synthesis of
cellular components such as long-chain fatty acids. After glycolysis, pyruvate is converted to acetyl-
Coenzyme A, which, in aerobic organisms, enters the citric acid cycle. Glycolytic enzymes include
hexokinase, phosphoglucose isomerase, phosphofructokinase, aldolase, triose phosphate isomerase,
glyceraldehyde 3-phosphate dehydrogenase, phosphoglycerate kinase, phosphoglyceromutase,
enolase, and pyruvate kinase. Of these, phosphofructokinase, hexokinase, and pyruvate kinase are
important in regulating the rate of glycolysis.
Gluconeogenesis

Gluconeogenesis is the synthesis of glucose from noncarbohydrate precursors such as lactate
and amino acids. The pathway, which functions mainly in times of starvation and intense exercise,
occurs mostly in the liver and kidney. Responsible enzymes include pyruvate carboxylase,
phosphoenolpyruvate carboxykinase, fructose 1,6-bisphosphatase, and glucose-6-phosphatase.
Pentose Phosphate Pathway

Pentose phosphate pathway enzymes are responsible for generating the reducing agent
NADPH, while at the same time oxidizing glucose-6-phosphate to ribose-5-phosphate. Ribose-5-
phosphate and its derivatives become part of important biological molecules such as ATP, Coenzyme
A, NAD+, FAD, RNA, and DNA. The pentose phosphate pathway has both oxidative and non-
oxidative branches. The oxidative branch steps, which are catalyzed by the enzymes glucose-6-
phosphate dehydrogenase, lactonase, and 6-phosphogluconate dehydrogenase, convert glucose-6-
phosphate and NADP+ to ribulose-5-phosphate and NADPH. The non-oxidative branch steps, which
are catalyzed by the enzymes phosphopentose isomerase, phosphopentose epimerase, transketolase,
and transaldolase, allow the interconversion of three-, four-, five-, six-, and seven-carbon sugars.
Glucuronate Metabolism

Glucuronate is a monosaccharide which, in the form of D-glucuronic acid, is found in the 
GAGs chondroitin and dermatan. D-glucuronic acid is also important in the detoxification and 
excretion of foreign organic compounds such as phenol. Enzymes involved in glucuronate 
metabolism include UDP-glucose dehydrogenase and glucuronate reductase. 
Disaccharide Metabolism

Disaccharides must be hydrolyzed to monosaccharides to be digested. Lactose, a 
disaccharide found in milk, is hydrolyzed to galactose and glucose by the enzyme lactase. Maltose is
derived from plant starch and is hydrolyzed to glucose by the enzyme maltase. Sucrose is derived 
from plants and is hydrolyzed to glucose and fructose by the enzyme sucrase. Trehalose, a 
disaccharide found mainly in insects and mushrooms, is hydrolyzed to glucose by the enzyme

trehalase (OMIM *275360 Trehalase; Ruf, J. et al. (1990) J. Biol. Chem. 265:15034-15039). Lactose
synthase, composed of the catalytic subunit galactosyltransferase and the modifier subunit
α-lactalbumin, converts UDP-galactose and glucose to lactose in the mammary glands.

niym pf^n. Starch , -ir-^ Metabolism 

^.in.™«i»i.SP=*og»=(B».ILG......(1995)I.BioLCb«. 27^.26252.26256). 

P^ti^nalycans niYrnHaminoglvCMS 

-^y^^o^^an.;^^ «.«c U«„ ^branched p„„..cctand« co»^»ed », 

^u=o.a«..~»...«—.GAG«=*fr.= o.aap.« of proteoglycans. la,g»=»^ 
Uosedo,.co»p,o..ina,u^ed.o^o,„.,eOA<3s.G.Gaa„fo™do.>^»Ua^. 

, ._d.aea..i.l>>dM,a_e.,,oidd.ea».a«— d.^"^^^^ 

™,e„.c .up,. (H..^., C. « al. (1996) Oi. Exp. Kheum. 14 (S.ppl. 1 ).S59^67). 

:ri.clc.o^dna^a..^-«a..h=pa*.hep»».sU^,d— 

30 »ftc™n=c,ivato»s,and..b.^...y~vi..«(«.siBd«.A.^e„l.a«3,.^XBxp. 
Pathol. 74:27-34). HA seems to play important roles in cell regulation, development, and
differentiation (Laurent, T.C. and J.R. Fraser (1992) FASEB J. 6:2397-2404). Hyaluronidase is an
enzyme that degrades HA to oligosaccharides. Hyaluronidases may function in cell adhesion,

i.fec«o».angiogenesU.signd».««on."P"«'=°'«--''°^''^ 

ProBoglyc«>a.-.ol-o««.spepddogl,cam..refounaina.e«racdldar-»..nxot 

co^«a::.«..aca.mage.^a..,se.dal^d.s*«l^^«>"«*-'-"-- 
Cell-surface-attached proteoglycans anchor cells to the extracellular matrix. Both extracellular and
cell-surface proteoglycans bind growth factors, facilitating their binding to cell-surface receptors and
subsequent triggering of signal transduction pathways.

Amino Acid and Nitrogen Metabolism 

NH4+ is assimilated into amino acids by the actions of two enzymes, glutamate
dehydrogenase and glutamine synthetase. The carbon skeletons of amino acids come from the
intermediates of glycolysis, the pentose phosphate pathway, or the citric acid cycle. Of the twenty

amino acids used in proteins, humans can synthesize only eleven (nonessential amino acids). The

remaining nine must come from the diet (essential amino acids). Enzymes involved in nonessential
amino acid biosynthesis include glutamate kinase dehydrogenase, pyrroline carboxylate reductase,
asparagine synthetase, phenylalanine oxygenase, methionine adenosyltransferase,
adenosylhomocysteinase, cystathionine β-synthase, cystathionine γ-lyase, phosphoglycerate
dehydrogenase, phosphoserine transaminase, phosphoserine phosphatase, serine
hydroxymethyltransferase, and glycine synthase.

Metabolism of amino acids takes place almost entirely in the liver, where the amino group is
removed by aminotransferases (transaminases), for example, alanine aminotransferase. The amino
group is transferred to α-ketoglutarate to form glutamate. Glutamate dehydrogenase converts
glutamate to NH4+ and α-ketoglutarate. NH4+ is converted to urea by the urea cycle, which is
catalyzed by the enzymes arginase, ornithine transcarbamoylase, arginosuccinate synthetase, and
arginosuccinase. Carbamoyl phosphate synthetase is also involved in urea formation. Enzymes
involved in the metabolism of the carbon skeleton of amino acids include serine dehydratase,
asparaginase, glutaminase, propionyl CoA carboxylase, methylmalonyl CoA mutase, branched-chain
α-keto dehydrogenase complex, isovaleryl CoA dehydrogenase, β-methylcrotonyl CoA carboxylase,
phenylalanine hydroxylase, p-hydroxylphenylpyruvate hydroxylase, and homogentisate oxidase.

Polyamines, which include spermidine, putrescine, and spermine, bind tightly to nucleic
acids and are abundant in rapidly proliferating cells. Enzymes involved in polyamine synthesis
include ornithine decarboxylase.

Diseases involved in amino acid and nitrogen metabolism include hyperammonemia,
carbamoyl phosphate synthetase deficiency, urea cycle enzyme deficiencies, methylmalonic aciduria,
maple syrup disease, alcaptonuria, and phenylketonuria.
Energy Metabolism

Cells derive energy from metabolism of ingested compounds that may be roughly categorized 
as carbohydrates, fats, or proteins. Energy is also stored in polymers such as triglycerides (fats) and 
glycogen (carbohydrates). Metabolism proceeds along separate reaction pathways connected by key 
intermediates such as acetyl coenzyme A (acetyl-CoA). Metabolic pathways feature anaerobic and
aerobic degradation, coupled with the energy-requiring reactions such as phosphorylation of
adenosine diphosphate (ADP) to the triphosphate (ATP) or analogous phosphorylations of guanosine
(GDP/GTP), uridine (UDP/UTP), or cytidine (CDP/CTP). Subsequent dephosphorylation of the
triphosphate drives reactions needed for cell maintenance, growth, and proliferation.

Digestive enzymes convert carbohydrates and sugars to glucose; fructose and galactose are
converted in the liver to glucose. Enzymes involved in these conversions include galactose-1-
phosphate uridyltransferase and UDP-galactose-4 epimerase. In the cytoplasm, glycolysis converts
glucose to pyruvate in a series of reactions coupled to ATP synthesis.

Pyruvate is transported into the mitochondria and converted to acetyl-CoA for oxidation via
the citric acid cycle, involving pyruvate dehydrogenase components, dihydrolipoyl transacetylase,
and dihydrolipoyl dehydrogenase. Enzymes involved in the citric acid cycle include: citrate
synthetase, aconitases, isocitrate dehydrogenase, alpha-ketoglutarate dehydrogenase complex
including transsuccinylases, succinyl CoA synthetase, succinate dehydrogenase, fumarases, and
malate dehydrogenase. Acetyl CoA is oxidized to CO2 with concomitant formation of NADH,
FADH2, and GTP. In oxidative phosphorylation, the transport of electrons from NADH and FADH2
to oxygen by dehydrogenases is coupled to the synthesis of ATP from ADP and Pi by the F0F1
ATPase complex in the mitochondrial inner membrane. Enzyme complexes responsible for electron
transport and ATP synthesis include the F0F1 ATPase complex, ubiquinone (CoQ)-cytochrome c
reductase, ubiquinone reductase, cytochrome b, cytochrome c1, FeS protein, and cytochrome c
oxidase.

Triglycerides are hydrolyzed to fatty acids and glycerol by lipases. Glycerol is then
phosphorylated to glycerol-3-phosphate by glycerol kinase and glycerol phosphate dehydrogenase,
and degraded by glycolysis. Fatty acids are transported into the mitochondria as fatty acyl-
carnitine esters and undergo oxidative degradation.

In addition to metabolic disorders such as diabetes and obesity, disorders of energy
metabolism are associated with cancers (Dorward, A. et al. (1997) J. Bioenerg. Biomembr. 29:385-
392), autism (Lombard, J. (1998) Med. Hypotheses 50:497-500), neurodegenerative disorders (Alexi,
T. et al. (1998) Neuroreport 9:R57-64), and neuromuscular disorders (DiMauro, S. et al. (1998)
Biochim. Biophys. Acta 1366:199-210). The myocardium is heavily dependent on oxidative
metabolism, so metabolic dysfunction often leads to heart disease (DiMauro, S. and M. Hirano
(1998) Curr. Opin. Cardiol. 13:190-197).

For a review of energy metabolism enzymes and intermediates, see Stryer, L. et al. (1995)
Biochemistry, W.H. Freeman and Co., San Francisco CA, pp. 443-652. For a review of energy
metabolism regulation, see Lodish, H. et al. (1995) Molecular Cell Biology, Scientific American
Books, New York NY, pp. 744-770.

Cofactor Metabolism

Cofactors, including coenzymes and prosthetic groups, are small molecular weight inorganic
or organic compounds that are required for the action of an enzyme. Many cofactors contain
vitamins as a component. Cofactors include thiamine pyrophosphate, flavin adenine dinucleotide,
flavin mononucleotide, nicotinamide adenine dinucleotide, pyridoxal phosphate, coenzyme A,
tetrahydrofolate, lipoamide, and heme. The vitamins biotin and cobalamin are associated with
enzymes as well. Heme, a prosthetic group found in myoglobin and hemoglobin, consists of a
protoporphyrin group bound to iron. Porphyrin groups contain four substituted pyrroles covalently
joined in a ring, often with a bound metal atom. Enzymes involved in porphyrin synthesis include 5-
aminolevulinate synthase, 5-aminolevulinate dehydrase, porphobilinogen deaminase, and cosynthase.
Deficiencies in heme formation cause porphyrias. Heme is broken down as a part of erythrocyte
turnover. Enzymes involved in heme degradation include heme oxygenase and biliverdin reductase.

Iron is a required cofactor for many enzymes. Besides the heme-containing enzymes, iron is 
found in iron-sulfur clusters in proteins including aconitase, succinate dehydrogenase, and NADH-Q 
reductase. Iron is transported in the blood by the protein transferrin. Binding of transferrin to the 
transferrin receptor on cell surfaces allows uptake by receptor mediated endocytosis. Cytosolic iron is 
bound to ferritin protein.

A molybdenum-containing cofactor (molybdopterin) is found in enzymes including sulfite
oxidase, xanthine dehydrogenase, and aldehyde oxidase. Molybdopterin biosynthesis is performed
by two molybdenum cofactor synthesizing enzymes. Deficiencies in these enzymes cause mental
retardation and lens dislocation. Other diseases caused by defects in cofactor metabolism include
pernicious anemia and methylmalonic aciduria.
Secretion and Trafficking 

Eukaryotic cells are bound by a lipid bilayer membrane and subdivided into functionally 
distinct, membrane bound compartments. The membranes maintain the essential differences between
the cytosol, the extracellular environment, and the lumenal space of each intracellular organelle. As
lipid membranes are highly impermeable to most polar molecules, transport of essential nutrients,
metabolic waste products, cell signaling molecules, macromolecules and proteins across lipid
membranes and between organelles must be mediated by a variety of transport-associated molecules. 
Protein Trafficking 

In eukaryotes, some proteins are synthesized on ER-bound ribosomes, co-translationally 

imported into the ER, delivered from the ER to the Golgi complex for post-translational processing

and sorting, and transported from the Golgi to specific intracellular and extracellular destinations. 
All cells possess a constitutive transport process which maintains homeostasis between the cell and 
its environment. In many differentiated cell types, the basic machinery is modified to carry out
specific transport functions. For example, in endocrine glands, hormones and other secreted proteins
are packaged into secretory granules for regulated exocytosis to the cell exterior. In macrophages,
foreign extracellular material is engulfed (phagocytosis) and delivered to lysosomes for degradation.

In fat cells, glucose transporters are stored in vesicles and fuse with the plasma
membrane only in response to insulin stimulation.

The Secretory Pathway

Synthesis of most integral membrane proteins, secreted proteins, and proteins destined for the
lumen of a particular organelle occurs on ER-bound ribosomes. These proteins are co-translationally
imported into the ER. The proteins leave the ER via membrane-bound vesicles which bud off the ER
at specific sites and fuse with each other (homotypic fusion) to form the ER-Golgi Intermediate
Compartment (ERGIC). The ERGIC matures progressively through the cis, medial, and trans
cisternal stacks of the Golgi, modifying the enzyme composition by retrograde transport of
Golgi enzymes. In this way, proteins moving through the Golgi undergo post-translational
modification, such as glycosylation. The final Golgi compartment is the Trans-Golgi Network (TGN),
where both membrane and lumenal proteins are sorted for their final destination. Transport vesicles
destined for intracellular compartments, such as the lysosome, bud off the TGN. What remains is a
secretory vesicle which contains proteins destined for the plasma membrane, such as receptors,
adhesion molecules, and ion channels, and secretory proteins, such as hormones, neurotransmitters,
and digestive enzymes. Secretory vesicles eventually fuse with the plasma membrane (Glick, B.S.
and V. Malhotra (1998) Cell 95:883-889).

The secretory process can be constitutive or regulated. Most cells have a constitutive
pathway for secretion, whereby vesicles derived from maturation of the TGN require no specific
signal to fuse with the plasma membrane. In many cells, such as endocrine cells, digestive cells, and
neurons, vesicle pools derived from the TGN collect in the cytoplasm and do not fuse with the plasma
membrane until they are directed to by a specific signal.
Endocytosis

Endocytosis, wherein cells internalize material from the extracellular environment, is
essential for transmission of neuronal, metabolic, and proliferative signals; uptake of many essential
nutrients; and defense against invading organisms. Most cells exhibit two forms of endocytosis. The
first, phagocytosis, is an actin-driven process exemplified in macrophages and neutrophils. Material to
be endocytosed contacts numerous cell surface receptors which stimulate the plasma membrane to
extend and surround the particle, enclosing it in a membrane-bound phagosome. In the mammalian
immune system, IgG-coated particles bind Fc receptors on the surface of phagocytic leukocytes.
Activation of the Fc receptors initiates a signal cascade involving src-family cytosolic kinases and the
monomeric GTP-binding (G) protein Rho. The resulting actin reorganization leads to phagocytosis of
the particle. This process is an important component of the humoral immune response, allowing the
processing and presentation of bacterial-derived peptides to antigen-specific T-lymphocytes.
The second form of endocytosis, pinocytosis, is a more generalized uptake of material from
the external milieu. Like phagocytosis, pinocytosis is activated by ligand binding to cell surface
receptors. Activation of individual receptors stimulates an internal response that includes
coalescence of the receptor-ligand complexes and formation of clathrin-coated pits. Invagination of
the plasma membrane at clathrin-coated pits produces an endocytic vesicle within the cell cytoplasm.
These vesicles undergo homotypic fusion to form an early endosomal (EE) compartment. The
tubulovesicular EE serves as a sorting site for incoming material. ATP-driven proton pumps in the
EE membrane lower the pH of the EE lumen (pH 6.3-6.8). The acidic environment causes many
ligands to dissociate from their receptors. The receptors, along with membrane and other integral
membrane proteins, are recycled back to the plasma membrane by budding off the tubular extensions
of the EE in recycling vesicles (RV). This selective removal of recycled components produces a
carrier vesicle containing ligand and other material from the external environment. The carrier
vesicle fuses with TGN-derived vesicles which contain hydrolytic enzymes. The acidic environment
of the resulting late endosome (LE) activates the hydrolytic enzymes which degrade the ligands and
other material. As digestion takes place, the LE fuses with the lysosome where digestion is
completed (Mellman, I. (1996) Annu. Rev. Cell Dev. Biol. 12:575-625).

Recycling vesicles may return directly to the plasma membrane. Receptors internalized and
returned directly to the plasma membrane have a turnover rate of 2-3 minutes. Some RVs undergo
microtubule-directed relocation to a perinuclear site, from which they then return to the plasma
membrane. Receptors following this route have a turnover rate of 5-10 minutes. Still other RVs are
retained within the cell until an appropriate signal is received (Mellman, supra; and James, D.E. et al.
(1994) Trends Cell Biol. 4:120-126).
Vesicle Formation 

Several steps in the transit of material along the secretory and endocytic pathways require the 
formation of transport vesicles. Specifically, vesicles form at the transitional endoplasmic reticulum 
(tER), the rim of Golgi cisternae, the face of the Trans-Golgi Network (TGN), the plasma membrane
(PM), and tubular extensions of the endosomes. The process begins with the budding of a vesicle out
of the donor membrane. The membrane-bound vesicle contains proteins to be transported and is
surrounded by a protective coat made up of protein subunits recruited from the cytosol. The initial
budding and coating processes are controlled by a cytosolic ras-like GTP-binding protein, ADP-
ribosylating factor (Arf), and adapter proteins (AP). Different isoforms of both Arf and AP are
involved at different sites of budding. Another small G-protein, dynamin, forms a ring complex
around the neck of the forming vesicle and may provide the mechanochemical force to accomplish the
final step of the budding process. The coated vesicle complex is then transported through the cytosol.
During the transport process, Arf-bound GTP is hydrolyzed to GDP and the coat dissociates from the
transport vesicle (West, M.A. et al. (1997) J. Cell Biol. 138:1239-1254). Two different classes of
coat protein have also been identified. Clathrin coats form on the TGN and PM surfaces, whereas
coatomer or COP coats form on the ER and Golgi. COP coats can further be distinguished as COPI,
regulator of fusion in the neuron (Littleton, J.T. et al. (1993) Cell 74:1125-1134). The most abundant
membrane protein of synaptic vesicles appears to be the glycoprotein synaptophysin, a 38 kDa protein
with four transmembrane domains.

Specificity between a vesicle and its target is derived from the v-SNARE, t-SNAREs, and
associated proteins involved. Different isoforms of SNAREs and Rabs show distinct cellular and
subcellular distributions. VAMP-1/synaptobrevin, membrane-anchored synaptosome-associated
protein of 25 kDa (SNAP-25), syntaxin-1, Rab3A, Rab15, and Rab23 are predominantly expressed in
the brain and nervous system. Different syntaxin, VAMP, and Rab proteins are associated with
distinct subcellular compartments and their vesicular carriers.

Nuclear Transport

Transport of proteins and RNA between the nucleus and the cytoplasm occurs through 
nuclear pore complexes (NPCs). NPC-mediated transport occurs in both directions through the
nuclear envelope. All nuclear proteins are imported from the cytoplasm, their site of synthesis.
tRNA and mRNA are exported from the nucleus, their site of synthesis, to the cytoplasm, their site of
function. Processing of small nuclear RNAs involves export into the cytoplasm, assembly with
proteins and modifications such as hypermethylation to produce small nuclear ribonucleoproteins
(snRNPs), and subsequent import of the snRNPs back into the nucleus. The assembly of ribosomes
requires the initial import of ribosomal proteins from the cytoplasm, their incorporation with RNA
into ribosomal subunits, and export back to the cytoplasm. (Görlich, D. and I.W. Mattaj (1996)
Science 271:1513-1518.)

The transport of proteins and mRNAs across the NPC is selective, dependent on nuclear 
localization signals, and generally requires association with nuclear transport factors. Nuclear 
localization signals (NLS) consist of short stretches of amino acids enriched in basic residues. NLS 
are found on proteins that are targeted to the nucleus, such as the glucocorticoid receptor. The NLS is
recognized by the NLS receptor, importin, which then interacts with the monomeric GTP-binding
protein Ran. This NLS protein/receptor/Ran complex navigates the nuclear pore with the help of the
homodimeric protein nuclear transport factor 2 (NTF2). NTF2 binds the GDP-bound form of Ran
and to multiple proteins of the nuclear pore complex containing FXFG repeat motifs, such as p62.
(Paschal, B. et al. (1997) J. Biol. Chem. 272:21534-21539; and Wong, D.H. et al. (1997) Mol. Cell
Biol. 17:3755-3767). Some proteins are dissociated before nuclear mRNAs are transported across the
NPC while others are dissociated shortly after nuclear mRNA transport across the NPC and are
reimported into the nucleus.
Disease Correlation 

The etiology of numerous human diseases and disorders can be attributed to defects in the 
transport or secretion of proteins. For example, abnormal hormonal secretion is linked to disorders
such as diabetes insipidus (vasopressin), hyper- and hypoglycemia (insulin, glucagon), Grave's
nephrogenic diabetes insipidus (OMIM *107777 Aquaporin 2; AQP2). Reduced AQP4 expression in
skeletal muscle may be associated with Duchenne muscular dystrophy (Frigeri, A. et al. (1998) J.
Clin. Invest. 102:695-703). Mutations in AQP0 cause autosomal dominant cataracts in the mouse
(OMIM *154050 Major Intrinsic Protein of Lens Fiber; MIP).
The metallothioneins (MTs) are a group of small (61 amino acids), cysteine-rich proteins that

bind heavy metals such as cadmium, zinc, mercury, lead, and copper and are thought to play a role in 

metal detoxification or the metabolism and homeostasis of metals. Arsenite-resistance proteins have 

been identified in hamsters that are resistant to toxic levels of arsenite (Rossman, T.G. et al. (1997) 

Mutat. Res. 386:307-314). 
Humans respond to light and odors by specific protein pathways. Proteins involved in light
perception include rhodopsin, transducin, and cGMP phosphodiesterase. Proteins involved in odor
perception include multiple olfactory receptors. Other proteins are important in human circadian
rhythms and responses to wounds.

Immunity and Host Defense 
All vertebrates have developed sophisticated and complex immune systems that provide
protection from viral, bacterial, fungal and parasitic infections. Included in these systems are the
processes of humoral immunity, the complement cascade and the inflammatory response (Paul, W.E.
(1993) Fundamental Immunology, Raven Press, Ltd., New York NY, pp. 1-20).

The cellular components of the humoral immune system include six different types of
leukocytes: monocytes, lymphocytes, polymorphonuclear granulocytes (consisting of neutrophils,
eosinophils, and basophils) and plasma cells. Additionally, fragments of megakaryocytes, a seventh
type of white blood cell in the bone marrow, occur in large numbers in the blood as platelets.

Leukocytes are formed from two stem cell lineages in bone marrow. The myeloid stem cell
line produces granulocytes and monocytes, and the lymphoid stem cell line produces lymphocytes.
Lymphoid cells travel to the thymus, spleen and lymph nodes, where they mature and differentiate
into lymphocytes. Leukocytes are responsible for defending the body against invading pathogens.
Neutrophils and monocytes attack invading bacteria, viruses, and other pathogens and destroy them
by phagocytosis. Monocytes enter tissues and differentiate into macrophages which are extremely
phagocytic. Lymphocytes and plasma cells are a part of the immune system which recognizes
specific foreign molecules and organisms and inactivates them, as well as signals other cells to attack
the invaders.

Granulocytes and monocytes are formed and stored in the bone marrow until needed. 
Megakaryocytes are produced in bone marrow, where they fragment into platelets and are released 
into the bloodstream. The main function of platelets is to activate the blood clotting mechanism. 
Lymphocytes and plasma cells are produced in various lymphogenous organs, including the lymph
nodes, spleen, thymus, and tonsils. 
destroy the infected cell (Paul, supra).

T-lymphocytes originate in the bone marrow or liver in fetuses. Precursor cells migrate via
the blood to the thymus, where they are processed to mature into T-lymphocytes. This processing is
crucial because of positive and negative selection of T cells that will react with foreign antigen and
not with self molecules. After processing, T cells continuously circulate in the blood and secondary
lymphoid tissues, such as lymph nodes, spleen, certain epithelium-associated tissues in the
gastrointestinal tract, respiratory tract and skin. When T-lymphocytes are presented with the
complementary antigen, they are stimulated to proliferate and release large numbers of activated T
cells into the lymph system and the blood system. These activated T cells can survive and circulate
for several days. At the same time, T memory cells are created, which remain in the lymphoid tissue
for months or years. Upon subsequent exposure to that specific antigen, these memory cells will 
respond more rapidly and with a stronger response than induced by the original antigen. This creates 
an "immunological memory" that can provide immunity for years. 

There are two major types of T cells: cytotoxic T cells destroy infected host cells, and helper
T cells activate other white blood cells via chemical signals. One class of helper cell, TH1, activates
macrophages to destroy ingested microorganisms, while another, TH2, stimulates the production of
antibodies by B cells. 

Cytotoxic T cells directly attack the infected target cell. In virus-infected cells, peptides 
derived from viral proteins are generated by the proteasome. These peptides are transported into the 
ER by the transporter associated with antigen processing (TAP) (Pamer, E. and P. Cresswell (1998)
Annu. Rev. Immunol. 16:323-358). Once inside the ER, the peptides bind MHC I chains, and the
peptide/MHC I complex is transported to the cell surface. Receptors on the surface of T cells bind to
antigen presented on cell surface MHC molecules. Once activated by binding to antigen, T cells
secrete γ-interferon, a signal molecule that induces the expression of genes necessary for presenting
viral (or other) antigens to cytotoxic T cells. Cytotoxic T cells kill the infected cell by stimulating
programmed cell death.

Helper T cells constitute up to 75% of the total T cell population. They regulate the immune 
functions by producing a variety of lymphokines that act on other cells in the immune system and on 
bone marrow. Among these lymphokines are: interleukins-2, 3, 4, 5, and 6; granulocyte-monocyte
colony stimulating factor; and γ-interferon.

Helper T cells are required for most B cells to respond to antigen. When an activated helper 
cell contacts a B cell, its centrosome and Golgi apparatus become oriented toward the B cell, aiding
the directing of signal molecules, such as the transmembrane-bound protein called CD40 ligand, onto
the B cell surface to interact with the CD40 transmembrane protein. Secreted signals also help B cells
to proliferate and mature and, in some cases, to switch the class of antibody being produced.

B-lymphocytes (B cells) produce antibodies which react with specific antigenic proteins 
helper T-cells, which then secrete cytokines and other factors that stimulate the immune response.
MHC molecules also play an important role in organ rejection following transplantation. Rejection
occurs when the recipient's T-cells respond to foreign MHC molecules on the transplanted organ in
the same way as to self MHC molecules bound to foreign antigen. (Reviewed in Alberts, B. et al.
(1994) Molecular Biology of the Cell, Garland Publishing, New York NY, pp. 1229-1246.)

Antibodies, or immunoglobulins (Ig), are the founding members of the Ig superfamily and
the central components of the humoral immune response. Antibodies are either expressed on the
surface of B cells or secreted by B cells into the circulation. Antibodies bind and neutralize blood-
borne foreign antigens. The prototypical antibody is a tetramer consisting of two identical heavy
polypeptide chains (H-chains) and two identical light polypeptide chains (L-chains) interlinked by
disulfide bonds. This arrangement confers the characteristic Y-shape to antibody molecules.
Antibodies are classified based on their H-chain composition. The five antibody classes, IgA, IgD,
IgE, IgG and IgM, are defined by the α, δ, ε, γ, and μ H-chain types. There are two types of L-
chains, κ and λ, either of which may associate as a pair with any H-chain pair. IgG, the most
common class of antibody found in the circulation, is tetrameric, while the other classes of antibodies
are generally variants or multimers of this basic structure.
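
The class-to-chain relationships described above can be written as a small lookup table. This is a minimal sketch for illustration only: the names `H_CHAIN_BY_CLASS`, `L_CHAIN_TYPES`, and `possible_pairings` are hypothetical, and the Greek letters are spelled out (alpha, delta, epsilon, gamma, mu; kappa, lambda).

```python
# Illustrative sketch: heavy-chain type that defines each antibody class.
H_CHAIN_BY_CLASS = {
    "IgA": "alpha",
    "IgD": "delta",
    "IgE": "epsilon",
    "IgG": "gamma",
    "IgM": "mu",
}

# Either light-chain type may pair with any heavy-chain pair.
L_CHAIN_TYPES = ("kappa", "lambda")


def possible_pairings():
    """Enumerate every (class, H-chain type, L-chain type) combination."""
    return [
        (ig_class, h_chain, l_chain)
        for ig_class, h_chain in H_CHAIN_BY_CLASS.items()
        for l_chain in L_CHAIN_TYPES
    ]


if __name__ == "__main__":
    for ig_class, h_chain, l_chain in possible_pairings():
        print(f"{ig_class}: two {h_chain} H-chains + two {l_chain} L-chains")
```

Five H-chain types and two L-chain types give ten class and light-chain combinations under this simplified view.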

H-chains and L-chains each contain an N-terminal variable region and a C-terminal constant
region. The constant region consists of about 110 amino acids in L-chains and about 330 or 440
amino acids in H-chains. The amino acid sequence of the constant region is nearly identical among
H- or L-chains of a particular class. The variable region consists of about 110 amino acids in both H-
and L-chains. Both H-chains and L-chains contain repeated Ig domains. For example, a typical H-
chain contains four Ig domains, three of which occur within the constant region and one of which
occurs within the variable region and contributes to the formation of the antigen recognition site.
Likewise, a typical L-chain contains two Ig domains, one of which occurs within the constant region
and one of which occurs within the variable region. In addition, H chains such as μ have been shown
to associate with other polypeptides during differentiation of the B cell. The amino acid sequence of
the variable region differs among H- or L-chains of a particular class. Within each H- or L-chain
variable region are three hypervariable regions of extensive sequence diversity, each consisting of
about 5 to 10 amino acids. In the antibody molecule, the H- and L-chain hypervariable regions come
together to form the antigen recognition site. (Reviewed in Alberts, supra, pp. 1206-1213 and 1216-
1217.)
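
The approximate region sizes quoted above imply rough overall chain lengths. The sketch below simply adds those approximate figures; the function and constant names are hypothetical, and the results are order-of-magnitude estimates rather than measured sequence lengths.

```python
# Approximate region sizes (amino acids) as quoted in the text above.
VARIABLE_REGION = 110
L_CONSTANT = 110
H_CONSTANT_OPTIONS = (330, 440)


def chain_length(constant, variable=VARIABLE_REGION):
    """Approximate chain length: variable region plus constant region."""
    return variable + constant


def tetramer_length(h_constant):
    """Approximate residues in a tetramer of two H-chains and two L-chains."""
    return 2 * chain_length(h_constant) + 2 * chain_length(L_CONSTANT)


if __name__ == "__main__":
    print("L-chain:", chain_length(L_CONSTANT))
    for h_constant in H_CONSTANT_OPTIONS:
        print("H-chain:", chain_length(h_constant),
              "tetramer:", tetramer_length(h_constant))
```

With these figures an L-chain comes to about 220 amino acids and an H-chain to about 440 or 550, so a four-chain antibody is on the order of 1,300 to 1,500 amino acids.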

Antibodies can be described in terms of their two main functional domains. Antigen
recognition is mediated by the Fab (antigen binding fragment) region of the antibody, while effector
functions are mediated by the Fc (crystallizable fragment) region. Binding of antibody to an antigen,
such as a bacterium, triggers the destruction of the antigen by phagocytic white blood cells such as
macrophages and neutrophils. These cells express surface receptors that specifically bind to the
serve to define the target and initiate the complement system cascade, culminating in the destruction
of the infectious agent. In this pathway, since the antibody guides initiation of the process, the
complement can be seen as an effector arm of the humoral immune system.

The alternative pathway of the complement system does not require the presence of pre-
existing antibodies for targeting infectious agent destruction. Rather, this pathway, through low
levels of an activated component, remains constantly primed and provides surveillance in the non-
immune host to enable targeting and destruction of infectious agents. In this case foreign material
triggers the cascade, thereby facilitating phagocytosis or lysis (Paul, supra, pp. 918-919).

Another important component of host defense is the process of inflammation. Inflammatory
responses are divided into four categories on the basis of pathology and include allergic
inflammation, cytotoxic antibody mediated inflammation, immune complex mediated inflammation,
and monocyte mediated inflammation. Inflammation manifests as a combination of each of these
forms with one predominating.

Allergic acute inflammation is observed in individuals wherein specific antigens stimulate
IgE antibody production. Mast cells and basophils are subsequently activated by the attachment of
antigen-IgE complexes, resulting in the release of cytoplasmic granule contents such as histamine.
The products of activated mast cells can increase vascular permeability and constrict the smooth
muscle of breathing passages, resulting in anaphylaxis or asthma. Acute inflammation is also
mediated by cytotoxic antibodies and can result in the destruction of tissue through the binding of
complement-fixing antibodies to cells. The responsible antibodies are of the IgG or IgM types.
Resultant clinical disorders include autoimmune hemolytic anemia and thrombocytopenia as
associated with systemic lupus erythematosus.

Immune complex mediated acute inflammation involves the IgG or IgM antibody types
which combine with antigen to activate the complement cascade. When such immune complexes
bind to neutrophils and macrophages they activate the respiratory burst to form protein- and
vessel-damaging agents such as hydrogen peroxide, hydroxyl radical, hypochlorous acid, and chloramines.
Clinical manifestations include rheumatoid arthritis and systemic lupus erythematosus.

In chronic inflammation or delayed-type hypersensitivity, macrophages are activated and
process antigen for presentation to T cells that subsequently produce lymphokines and monokines.
This type of inflammatory response is likely important for defense against intracellular parasites and
certain viruses. Clinical associations include granulomatous disease, tuberculosis, leprosy, and
sarcoidosis (Paul, W.E., supra, pp. 1017-1018).

Matrix proteins (MPs) are transmembrane and extracellular proteins which function in
formation, growth, remodeling, and maintenance of tissues and as important mediators and regulators
of the inflammatory response. The expression and balance of MPs may be perturbed by biochemical
changes that result from congenital, epigenetic, or infectious diseases. In addition, MPs affect
construction and organogenesis, in which cell proliferation, cell differentiation, and morphogenesis
must be spatially and temporally regulated in a precise and coordinated manner. Cells communicate
with one another through the secretion and uptake of diverse types of signaling molecules such as
hormones, growth factors, neuropeptides, and cytokines.

Hormones

Hormones are secreted molecules that travel through the circulation and bind to specific
receptors on the surface of, or within, target cells. Although they have diverse biochemical
compositions and mechanisms of action, hormones can be grouped into two categories. One category
includes small lipophilic hormones that diffuse through the plasma membrane of target cells, bind to
cytosolic or nuclear receptors, and form a complex that alters gene expression. Examples of these
molecules include retinoic acid, thyroxine, and the cholesterol-derived steroid hormones such as
progesterone, estrogen, testosterone, cortisol, and aldosterone. The second category includes
hydrophilic hormones that function by binding to cell surface receptors that transduce signals across
the plasma membrane. Examples of such hormones include amino acid derivatives such as
catecholamines and peptide hormones such as glucagon, insulin, gastrin, secretin, cholecystokinin,
adrenocorticotropic hormone, follicle stimulating hormone, luteinizing hormone, thyroid stimulating
hormone, and vasopressin. (See, for example, Lodish et al. (1995) Molecular Cell Biology, Scientific
American Books Inc., New York NY, pp. 856-864.)

Hormones are signaling molecules that coordinately regulate basic physiological processes
from embryogenesis throughout adulthood. These processes include metabolism, respiration,
reproduction, excretion, fetal tissue differentiation and organogenesis, growth and development,
homeostasis, and the stress response. Hormonal secretions and the nervous system are tightly
integrated and interdependent. Hormones are secreted by endocrine glands, primarily the
hypothalamus and pituitary, the thyroid and parathyroid, the pancreas, the adrenal glands, and the
ovaries and testes.

The secretion of hormones into the circulation is tightly controlled. Hormones are often
secreted in diurnal, pulsatile, and cyclic patterns. Hormone secretion is regulated by perturbations in
blood biochemistry, by other upstream-acting hormones, by neural impulses, and by negative
feedback loops. Blood hormone concentrations are constantly monitored and adjusted to maintain
optimal, steady-state levels. Once secreted, hormones act only on those target cells that express
specific receptors.

Most disorders of the endocrine system are caused by either hyposecretion or hypersecretion
of hormones. Hyposecretion often occurs when a hormone's gland of origin is damaged or otherwise
impaired. Hypersecretion often results from the proliferation of tumors derived from
hormone-secreting cells. Inappropriate hormone levels may also be caused by defects in regulatory feedback
loops or in the processing of hormone precursors. Endocrine malfunction may also occur when the
hypothyroidism include goiter, myxedema, acute thyroiditis associated with bacterial infection,
subacute thyroiditis associated with viral infection, autoimmune thyroiditis (Hashimoto's disease),
and cretinism. Disorders associated with hyperthyroidism include thyrotoxicosis and its various
forms, Graves' disease, pretibial myxedema, toxic multinodular goiter, thyroid carcinoma, and
Plummer's disease. Disorders associated with hyperparathyroidism include Conn disease (chronic
hypercalcemia) leading to bone resorption and parathyroid hyperplasia.

Hormones secreted by the pancreas regulate blood glucose levels by modulating the rates of
carbohydrate, fat, and protein metabolism. Pancreatic hormones include insulin, glucagon, amylin,
γ-aminobutyric acid, gastrin, somatostatin, and pancreatic polypeptide. The principal disorder
associated with pancreatic dysfunction is diabetes mellitus caused by insufficient insulin activity.
Diabetes mellitus is generally classified as either Type I (insulin-dependent, juvenile diabetes) or
Type II (non-insulin-dependent, adult diabetes). The treatment of both forms by insulin replacement
therapy is well known. Diabetes mellitus often leads to acute complications such as hypoglycemia
(insulin shock), coma, diabetic ketoacidosis, lactic acidosis, and chronic complications leading to
disorders of the eye, kidney, skin, bone, joint, cardiovascular system, nervous system, and to
decreased resistance to infection.

The anatomy, physiology, and diseases related to hormonal function are reviewed in
McCance, K.L. and S.E. Huether (1994) Pathophysiology: The Biological Basis for Disease in Adults
and Children, Mosby-Year Book, Inc., St. Louis MO; Greenspan, F.S. and J.D. Baxter (1994) Basic
and Clinical Endocrinology, Appleton and Lange, East Norwalk CT.
Growth Factors 

Growth factors are secreted proteins that mediate intercellular communication. Unlike
hormones, which travel great distances via the circulatory system, most growth factors are primarily
local mediators that act on neighboring cells. Most growth factors contain a hydrophobic N-terminal
signal peptide sequence which directs the growth factor into the secretory pathway. Most growth
factors also undergo post-translational modifications within the secretory pathway. These
modifications can include proteolysis, glycosylation, phosphorylation, and intramolecular disulfide
bond formation. Once secreted, growth factors bind to specific receptors on the surfaces of
neighboring target cells, and the bound receptors trigger intracellular signal transduction pathways.
These signal transduction pathways elicit specific cellular responses in the target cells. These
responses can include the modulation of gene expression and the stimulation or inhibition of cell
division, cell differentiation, and cell motility.

Growth factors fall into at least two broad and overlapping classes. The broadest class
includes the large polypeptide growth factors, which are wide-ranging in their effects. These factors
include epidermal growth factor (EGF), fibroblast growth factor (FGF), transforming growth factor-β
(TGF-β), insulin-like growth factor (IGF), nerve growth factor (NGF), and platelet-derived growth
epidermal, and connective tissues. Members of the TGF-β, EGF, and FGF families also function as
inductive signals in the differentiation of embryonic tissue. NGF functions specifically as a
neurotrophic factor, promoting neuronal growth and differentiation.

Another class of growth factors includes the hematopoietic growth factors, which are narrow
in their target specificity. These factors stimulate the proliferation and differentiation of blood cells
such as B-lymphocytes, T-lymphocytes, erythrocytes, platelets, eosinophils, basophils, neutrophils,
macrophages, and their stem cell precursors. These factors include the colony-stimulating factors
(G-CSF, M-CSF, GM-CSF, and CSF1-3), erythropoietin, and the cytokines. The cytokines are
specialized hematopoietic factors secreted by cells of the immune system and are discussed in detail
below.



Growth factors play critical roles in neoplastic transformation of cells in vitro and in
tumor progression in vivo. Overexpression of the large polypeptide growth factors promotes the
proliferation and transformation of cells in culture. Inappropriate expression of these growth factors
by tumor cells in vivo may contribute to tumor vascularization and metastasis. Inappropriate activity
of hematopoietic growth factors can result in anemias, leukemias, and lymphomas. Moreover,
growth factors are both structurally and functionally related to oncoproteins, the potentially
cancer-causing products of proto-oncogenes. Certain FGF and PDGF family members are themselves
homologous to oncoproteins, whereas receptors for some members of the EGF, NGF, and FGF
families are encoded by proto-oncogenes. Growth factors also affect the transcriptional regulation of
both proto-oncogenes and oncosuppressor genes (Pimentel, E. (1994) Handbook of Growth Factors,
CRC Press, Ann Arbor MI; McKay, I. and I. Leigh, eds. (1993) Growth Factors: A Practical
Approach, Oxford University Press, New York NY; Habenicht, A., ed. (1990) Growth Factors,
Differentiation Factors, and Cytokines, Springer-Verlag, New York NY).

In addition, some of the large polypeptide growth factors play crucial roles in the induction of
the primordial germ layers in the developing embryo. This induction ultimately results in the
formation of the embryonic mesoderm, ectoderm, and endoderm which in turn provide the framework
for the entire adult body plan. Disruption of this inductive process would be catastrophic to
embryonic development.

Small Peptide Factors - Neuropeptides and Vasomediators

Neuropeptides and vasomediators (NP/VM) comprise a family of small peptide factors,
typically of 20 amino acids or less. These factors generally function in neuronal excitation and
inhibition of vasoconstriction/vasodilation, muscle contraction, and hormonal secretions from the
brain and other endocrine tissues. Included in this family are neuropeptides and neuropeptide
hormones such as bombesin, neuropeptide Y, neurotensin, neuromedin N, melanocortins, opioids,
galanin, somatostatin, tachykinins, urotensin II and related peptides involved in smooth muscle
stimulation, vasopressin, vasoactive intestinal peptide, and circulatory system-borne signaling
molecules such as angiotensin, complement, calcitonin, endothelins, formyl-methionyl peptides,
glucagon, cholecystokinin, gastrin, and many of the peptide hormones discussed above. NP/VMs can
transduce signals directly, modulate the activity or release of other neurotransmitters and hormones,
and act as catalytic enzymes in signaling cascades. The effects of NP/VMs range from extremely
brief to long-lasting. (Reviewed in Martin, C.R. et al. (1985) Endocrine Physiology, Oxford
University Press, New York NY, pp. 57-62.)

Cytokines

Cytokines comprise a family of signaling molecules that modulate the immune system and the
inflammatory response. Cytokines are usually secreted by leukocytes, or white blood cells, in
response to injury or infection. Cytokines function as growth and differentiation factors that act
primarily on cells of the immune system such as B- and T-lymphocytes, monocytes, macrophages,
and granulocytes. Like other signaling molecules, cytokines bind to specific plasma membrane
receptors and trigger intracellular signal transduction pathways which alter gene expression patterns.
There is considerable potential for the use of cytokines in the treatment of inflammation and immune
system disorders.

Cytokines are secreted by hematopoietic cells in response to injury or infection. Interleukins,
neurotrophins, growth factors, interferons, and chemokines all define cytokine families that work in
conjunction with cellular receptors to regulate cell proliferation and differentiation. In addition,
cytokines affect activities such as leukocyte migration and function, hematopoietic cell proliferation,
temperature regulation, acute response to infection, tissue remodeling, and apoptosis.

Cytokine structure and function have been extensively characterized in vitro. Most cytokines
are small polypeptides of about 30 kilodaltons or less. Over 50 cytokines have been identified from
human and rodent sources. Examples of cytokine subfamilies include the interferons (IFN-α, -β, and
-γ), the interleukins (IL1-IL13), the tumor necrosis factors (TNF-α and -β), and the chemokines.
Many cytokines have been produced using recombinant DNA techniques, and the activities of
individual cytokines have been determined in vitro. These activities include regulation of leukocyte
proliferation, differentiation, and motility.

The activity of an individual cytokine in vitro may not reflect the full scope of that cytokine's
activity in vivo. Cytokines are not expressed individually in vivo but are instead expressed in
combination with a multitude of other cytokines when the organism is challenged with a stimulus.
Together, these cytokines collectively modulate the immune response in a manner appropriate for that
particular stimulus. Therefore, the physiological activity of a cytokine is determined by the stimulus
itself and by complex interactive networks among co-expressed cytokines which may demonstrate
N.C. and M.C. Peitsch (1997) J. Leukoc. Biol. 61:545-550.) Chemokines were initially identified as
subfamilies based on the presence of conserved cysteine-based motifs. (Callard, R. and Gearing, A.
(1994) The Cytokine Facts Book, Academic Press.)

Recent evidence indicates that chemokines may also play key roles in hematopoiesis and HIV
infection. Chemokines are small proteins which range from about 6-15 kilodaltons in molecular
weight. Chemokines are further classified as C, CC, CXC, or CX3C based on the number and
position of critical cysteine residues. The CC chemokines, for example, each contain a conserved
motif consisting of two consecutive cysteines followed by two additional cysteines which occur
downstream at 24- and 16-residue intervals, respectively (ExPASy PROSITE database, documents
PS00472 and PDOC00434). The presence and spacing of these four cysteine residues are highly
conserved, whereas the intervening residues diverge significantly. However, a conserved tyrosine
located about 15 residues downstream of the cysteine doublet seems to be important for chemotactic
activity. Most of the human genes encoding CC chemokines are clustered on chromosome 17,
although there are a few examples of CC chemokine genes that map elsewhere. Other chemokines
include lymphotactin (C chemokine); macrophage chemotactic and activating factor (MCAF/MCP-1;
CC chemokine); platelet factor 4 and IL-8 (CXC chemokines); and fractalkine and neurotractin
(CX3C chemokines). (Reviewed in Luster, A.D. (1998) N. Engl. J. Med. 338:436-445.)
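
The cysteine-based naming of the chemokine classes can be illustrated with a minimal sketch. It assumes that the class can be read from the number of residues between the first two cysteines of the mature protein (none for CC, one for CXC, three for CX3C) and that a protein with fewer than four cysteines falls into the C class. The function name and the toy sequences are hypothetical and are used for illustration only; this is a simplified heuristic, not a validated classifier and not the PROSITE pattern.

```python
def classify_chemokine(mature_sequence: str) -> str:
    """Guess a chemokine class from the spacing of the first two cysteines.

    Illustrative heuristic only (an assumption, not a published method).
    """
    seq = mature_sequence.upper()
    # Zero-based positions of every cysteine in the sequence.
    positions = [i for i, residue in enumerate(seq) if residue == "C"]
    if len(positions) < 2:
        return "unclassified"
    if len(positions) < 4:
        # Assumed here: C chemokines lack two of the four conserved cysteines.
        return "C"
    # Number of residues lying between the first two cysteines.
    gap = positions[1] - positions[0] - 1
    return {0: "CC", 1: "CXC", 3: "CX3C"}.get(gap, "unclassified")


# Toy sequences (hypothetical, not real chemokines).
TOY_CC = "AACC" + "A" * 24 + "C" + "A" * 16 + "C"
TOY_CXC = "AACQC" + "A" * 20 + "C" + "A" * 15 + "CA"
TOY_CX3C = "AACNITC" + "A" * 20 + "C" + "A" * 15 + "CA"
TOY_C = "AAAC" + "A" * 30 + "C" + "AA"

for name, seq in [("CC", TOY_CC), ("CXC", TOY_CXC), ("CX3C", TOY_CX3C), ("C", TOY_C)]:
    print(name, "->", classify_chemokine(seq))
```

A real analysis would rely on curated motif databases and alignment to known family members rather than on cysteine spacing alone.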
Receptor Molecules 

The term receptor describes proteins that specifically recognize other molecules. The
category is broad and includes proteins with a variety of functions. The bulk of receptors are cell
surface proteins which bind extracellular ligands and produce cellular responses in the areas of
growth, differentiation, endocytosis, and immune response. Other receptors facilitate the selective
transport of proteins out of the endoplasmic reticulum and localize enzymes to particular locations in
the cell. The term may also be applied to proteins which act as receptors for ligands with known or
unknown chemical composition and which interact with other cellular components. For example, the
steroid hormone receptors bind to and regulate transcription of DNA.

Regulation of cell proliferation, differentiation, and migration is important for the formation
and function of tissues. Regulatory proteins such as growth factors coordinately control these cellular
processes and act as mediators in cell-cell signaling pathways. Growth factors are secreted proteins
that bind to specific cell-surface receptors on target cells. The bound receptors trigger intracellular
signal transduction pathways which activate various downstream effectors that regulate gene
expression, cell division, cell differentiation, cell motility, and other cellular processes.

Cell surface receptors are typically integral plasma membrane proteins. These receptors
recognize hormones such as catecholamines; peptide hormones; growth and differentiation factors;
small peptide factors such as thyrotropin-releasing hormone, galanin, somatostatin, and tachykinins;
and circulatory system-borne signaling molecules. Cell surface receptors on immune system cells
recognize antigens, antibodies, and major histocompatibility complex (MHC)-bound peptides. Other
cell surface receptors bind ligands to be internalized by the cell. This receptor-mediated endocytosis
functions in the uptake of low density lipoproteins (LDL), transferrin, glucose- or mannose-terminal
glycoproteins, galactose-terminal glycoproteins, immunoglobulins, phosphovitellogenins, fibrin,
proteinase-inhibitor complexes, plasminogen activators, and thrombospondin (Lodish, H. et al. (1995)
Molecular Cell Biology, Scientific American Books, New York NY, p. 723; Mikhailenko, I. et al.
(1997) J. Biol. Chem. 272:6784-6791).
Receptor Protein Kinases 

Many growth factor receptors, including receptors for epidermal growth factor,
platelet-derived growth factor, and fibroblast growth factor, as well as the growth modulator α-thrombin,
contain intrinsic protein kinase activities. When growth factor binds to the receptor, it triggers the
autophosphorylation of a serine, threonine, or tyrosine residue on the receptor. These phosphorylated
sites are recognition sites for the binding of other cytoplasmic signaling proteins. These proteins
participate in signaling pathways that eventually link the initial receptor activation at the cell surface
to the activation of a specific intracellular target molecule. In the case of tyrosine residue
autophosphorylation, these signaling proteins contain a common domain referred to as a Src
homology (SH) domain. SH2 domains and SH3 domains are found in phospholipase C-γ, PI-3-K p85
regulatory subunit, Ras-GTPase activating protein, and pp60c-src (Lowenstein, E.J. et al. (1992) Cell
70:431-442). The cytokine family of receptors shares a different common binding domain and includes
transmembrane receptors for growth hormone (GH), interleukins, erythropoietin, and prolactin.

Other receptors and second messenger-binding proteins have intrinsic serine/threonine
protein kinase activity. These include activin/TGF-β/BMP-superfamily receptors, calcium- and
diacylglycerol-activated/phospholipid-dependent protein kinase (PK-C), and RNA-dependent protein
kinase (PK-R). In addition, other serine/threonine protein kinases, including nematode Twitchin,
have fibronectin-like, immunoglobulin C2-like domains.
G-Protein Coupled Receptors 

G-protein coupled receptors (GPCR) are a superfamily of integral membrane proteins which
transduce extracellular signals. GPCRs are characterized by the presence of seven hydrophobic
transmembrane domains which span the plasma membrane and form a bundle of antiparallel alpha (α)
helices. In most cases, the bundle of α helices forms a binding pocket. These proteins range in size
GPCR mutations, which may cause loss of function or constitutive activation, have been
associated with numerous human diseases (Coughlin, supra). Mutations and changes in
transcriptional activation of GPCR-encoding genes have been associated with neurological disorders
such as schizophrenia, Parkinson's disease, Alzheimer's disease, drug addiction, and feeding
disorders. For instance, retinitis pigmentosa may arise from mutations in the rhodopsin gene.
Rhodopsin is the retinal photoreceptor which is located within the discs of the eye rod cell. Parma, J.
et al. (1993, Nature 365:649-651) report that somatic activating mutations in the thyrotropin receptor
cause hyperfunctioning thyroid adenomas and suggest that certain GPCRs susceptible to constitutive
activation may behave as proto-oncogenes.

Nuclear Receptors

Nuclear receptors bind small molecules such as hormones or second messengers, leading to
increased receptor-binding affinity to specific chromosomal DNA elements. In addition, the affinity
for other nuclear proteins may also be altered. Such binding and protein-protein interactions may
regulate and modulate gene expression. Examples of such receptors include the steroid hormone
receptors family, the retinoic acid receptors family, and the thyroid hormone receptors family.
Ligand-Gated Receptor Ion Channels 

Ligand-gated receptor ion channels fall into two categories. The first category, extracellular
ligand-gated receptor ion channels (ELGs), rapidly transduce neurotransmitter-binding events into
electrical signals, such as fast synaptic neurotransmission. ELG function is regulated by
post-translational modification. The second category, intracellular ligand-gated receptor ion channels
(ILGs), are activated by many intracellular second messengers and do not require post-translational
modification(s) to effect a channel-opening response.

ELGs depolarize excitable cells to the threshold of action potential generation. In
non-excitable cells, ELGs permit a limited calcium ion-influx during the presence of agonist. ELGs
include channels directly gated by neurotransmitters such as acetylcholine, L-glutamate, glycine,
ATP, serotonin, GABA, and histamine. ELG genes encode proteins having strong structural and
functional similarities. ILGs are encoded by distinct and unrelated gene families and include
receptors for cAMP, cGMP, calcium ions, ATP, and metabolites of arachidonic acid.
Macrophage Scavenger Receptors 

Macrophage scavenger receptors are integral membrane proteins with broad ligand specificity
that may participate in the binding of low density lipoproteins (LDL) and foreign antigens. Scavenger
receptors types I and II are trimeric membrane proteins with each subunit containing a small
N-terminal intracellular domain, a transmembrane domain, a large extracellular domain, and a
C-terminal cysteine-rich domain. The extracellular domain contains a short spacer domain, an α-helical
coiled-coil domain, and a triple helical collagenous domain. These receptors have been shown to
bind a spectrum of ligands, including chemically modified lipoproteins and albumin,
Intracellular Signaling Molecules

Intracellular signaling is the general process by which cells respond to extracellular signals
(hormones, neurotransmitters, growth and differentiation factors, etc.) through a cascade of
biochemical reactions that begins with the binding of a signaling molecule to a cell membrane
receptor and ends with the activation of an intracellular target molecule. Intermediate steps in the
process involve the activation of various cytoplasmic proteins by phosphorylation via protein kinases,
and their deactivation by protein phosphatases, and the eventual translocation of some of these
activated proteins to the cell nucleus where the transcription of specific genes is triggered. The
intracellular signaling process regulates all types of cell functions including cell proliferation, cell
differentiation, and gene transcription, and involves a diversity of molecules including protein
kinases and phosphatases, and second messenger molecules, such as cyclic nucleotides,
calcium-calmodulin, inositol, and various mitogens, that regulate protein phosphorylation.

Protein Phosphorylation 

Protein kinases and phosphatases play a key role in the intracellular signaling process by
controlling the phosphorylation and activation of various signaling proteins. The high energy
phosphate for this reaction is generally transferred from the adenosine triphosphate molecule (ATP)
to a particular protein by a protein kinase and removed from that protein by a protein phosphatase.
Protein kinases are roughly divided into two groups: those that phosphorylate tyrosine residues
(protein tyrosine kinases, PTK) and those that phosphorylate serine or threonine residues
(serine/threonine kinases, STK). A few protein kinases have dual specificity for serine/threonine and
tyrosine residues. Almost all kinases contain a conserved 250-300 amino acid catalytic domain
containing specific residues and sequence motifs characteristic of the kinase family (Hardie, G. and
S. Hanks (1995) The Protein Kinase Facts Books, Vol 1:7-20, Academic Press, San Diego CA).
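
The grouping of protein kinases by target residue can be written out as a small lookup. The sketch below is only an illustration of that grouping; the function name and the three-letter residue codes are assumptions introduced here, and real kinase classification relies on catalytic domain sequence rather than on a list of substrates.

```python
def kinase_group(target_residues: set) -> str:
    """Assign a kinase to a group from the residues it phosphorylates.

    Hypothetical helper for illustration; residues are given as
    three-letter codes ("Ser", "Thr", "Tyr").
    """
    hits_tyrosine = "Tyr" in target_residues
    hits_ser_thr = bool({"Ser", "Thr"} & target_residues)
    if hits_tyrosine and hits_ser_thr:
        return "dual specificity"
    if hits_tyrosine:
        return "PTK"  # protein tyrosine kinase
    if hits_ser_thr:
        return "STK"  # serine/threonine kinase
    return "unclassified"


print(kinase_group({"Tyr"}))
print(kinase_group({"Ser", "Thr"}))
print(kinase_group({"Ser", "Tyr"}))
```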

STKs include the second messenger dependent protein kinases such as the cyclic-AMP
dependent protein kinases (PKA), involved in mediating hormone-induced cellular responses;
calcium-calmodulin (CaM) dependent protein kinases, involved in regulation of smooth muscle
contraction, glycogen breakdown, and neurotransmission; and the mitogen-activated protein kinases
(MAP) which mediate signal transduction from the cell surface to the nucleus via phosphorylation
cascades. Altered PKA expression is implicated in a variety of disorders and diseases including
cancer, thyroid disorders, diabetes, atherosclerosis, and cardiovascular disease (Isselbacher, K.J. et al.
(1994) Harrison's Principles of Internal Medicine, McGraw-Hill, New York NY, pp. 416-431, 1887).

PTKs are divided into transmembrane, receptor PTKs and nontransmembrane, non-receptor
PTKs. Transmembrane PTKs are receptors for most growth factors. Non-receptor PTKs lack
transmembrane regions and, instead, form complexes with the intracellular regions of cell surface
receptors. Receptors that function through non-receptor PTKs include those for cytokines and
hormones (growth hormone and prolactin) and antigen-specific receptors on T and B lymphocytes.
Many of these PTKs were first identified as the products of mutant oncogenes in cancer cells in
which their activation was no longer subject to normal cellular controls. In fact, about one third of
the known oncogenes encode PTKs, and it is well known that cellular transformation (oncogenesis) is
often accompanied by increased tyrosine phosphorylation activity (Charbonneau, H. and N.K. Tonks
(1992) Annu. Rev. Cell Biol. 8:463-493).

An additional family of protein kinases previously thought to exist only in prokaryotes is the
histidine protein kinase family (HPK). HPKs bear little homology with mammalian STKs or PTKs
particular, cyclic-AMP dependent protein kinases (PKA) are thought to account for all of the effects
of cAMP in most mammalian cells, including various hormone-induced cellular responses. Visual
excitation and the phototransmission of light signals in the eye are controlled by cyclic-GMP
regulated, Ca2+-specific channels. Because of the importance of cellular levels of cyclic nucleotides
in mediating these various responses, regulating the synthesis and breakdown of cyclic nucleotides is
an important matter. Thus adenylyl cyclase, which synthesizes cAMP from ATP, is activated to
increase cAMP levels in muscle by binding of adrenaline to β-adrenergic receptors, while activation
of guanylate cyclase and increased cGMP levels in photoreceptors leads to reopening of the
Ca2+-specific channels and recovery of the dark state in the eye. In contrast, hydrolysis of cyclic
nucleotides by cAMP and cGMP-specific phosphodiesterases (PDEs) produces the opposite of these
and other effects mediated by increased cyclic nucleotide levels. PDEs appear to be particularly
important in the regulation of cyclic nucleotides, considering the diversity found in this family of
proteins. At least seven families of mammalian PDEs (PDE1-7) have been identified based on
substrate specificity and affinity, sensitivity to cofactors, and sensitivity to inhibitory drugs (Beavo,
J.A. (1995) Physiological Reviews 75:725-748). PDE inhibitors have been found to be particularly
useful in treating various clinical disorders. Rolipram, a specific inhibitor of PDE4, has been used in
the treatment of depression, and similar inhibitors are undergoing evaluation as anti-inflammatory
agents. Theophylline is a nonspecific PDE inhibitor used in the treatment of bronchial asthma and
other respiratory diseases (Banner, K.H. and C.P. Page (1995) Eur. Respir. J. 8:996-1000).

G-Protein Signaling

Guanine nucleotide binding proteins (G-proteins) are critical mediators of signal transduction
between a particular class of extracellular receptors, the G-protein coupled receptors (GPCR), and
intracellular second messengers such as cAMP and Ca2+. G-proteins are linked to the cytosolic side
of a GPCR such that activation of the GPCR by ligand binding stimulates binding of the G-protein to
GTP, inducing an "active" state in the G-protein. In the active state, the G-protein acts as a signal to
trigger other events in the cell such as the increase of cAMP levels or the release of Ca2+ into the
cytosol from the ER, which, in turn, regulate phosphorylation and activation of other intracellular
proteins. Recycling of the G-protein to the inactive state involves hydrolysis of the bound GTP to
GDP by a GTPase activity in the G-protein. (See Alberts, B. et al. (1994) Molecular Biology of the
Cell, Garland Publishing, Inc., New York NY, pp. 734-759.) Two structurally distinct classes of
G-proteins are recognized: heterotrimeric G-proteins, consisting of three different subunits, and
monomeric, low molecular weight (LMW), G-proteins consisting of a single polypeptide chain.

The three polypeptide subunits of heterotrimeric G-proteins are the α, β, and γ subunits. The
α subunit binds and hydrolyzes GTP. The β and γ subunits form a tight complex that anchors the
protein to the inner side of the plasma membrane. The β subunits, also known as G-β proteins or β
transducins, contain seven tandem repeats of the WD-repeat sequence motif, a motif found in many
proteins with regulatory functions. Mutations and variant expression of β transducin proteins are
linked with various disorders (Neer, E.J. et al. (1994) Nature 371:297-300; Margottin, F. et al.
(1998) Mol. Cell 1:565-574).

LMW G-proteins are GTPases which regulate cell growth, cell cycle control, protein
secretion, and intracellular vesicle interaction. They consist of single polypeptides which, like the α
subunit of the heterotrimeric G-proteins, are able to bind and hydrolyze GTP, thus cycling between an
inactive and an active state. At least sixty members of the LMW G-protein superfamily have been
identified and are currently grouped into the six subfamilies of ras, rho, arf, sar1, ran, and rab.
Activated ras genes were initially found in human cancers, and subsequent studies confirmed that ras
function is critical in determining whether cells continue to grow or become differentiated. Other
members of the LMW G-protein superfamily have roles in signal transduction that vary with the
function of the activated genes and the locations of the G-proteins.

Guanine nucleotide exchange factors regulate the activities of LMW G-proteins by
determining whether GTP or GDP is bound. GTPase-activating protein (GAP) binds to GTP-ras and
induces it to hydrolyze GTP to GDP. In contrast, guanine nucleotide releasing protein (GNRP) binds
to GDP-ras and induces the release of GDP and the binding of GTP.
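
By way of illustration only, the opposing actions of GAP and GNRP on ras described above can be sketched as a two-state switch. The Python names used here (`LmwGProtein`, `apply_gap`, `apply_gnrp`) are hypothetical and are introduced solely for this example; the sketch tracks only which nucleotide is bound and makes no claim about kinetics or mechanism.

```python
from enum import Enum


class Nucleotide(Enum):
    GDP = "GDP"  # bound in the inactive state
    GTP = "GTP"  # bound in the active state


class LmwGProtein:
    """Toy model of an LMW G-protein such as ras (illustrative assumption)."""

    def __init__(self, bound=Nucleotide.GDP):
        self.bound = bound

    @property
    def active(self):
        return self.bound is Nucleotide.GTP


def apply_gap(protein):
    """GAP acts on the GTP-bound form and induces hydrolysis of GTP to GDP."""
    if protein.bound is Nucleotide.GTP:
        protein.bound = Nucleotide.GDP
    return protein


def apply_gnrp(protein):
    """GNRP acts on the GDP-bound form: GDP is released and GTP is bound."""
    if protein.bound is Nucleotide.GDP:
        protein.bound = Nucleotide.GTP
    return protein
```

In this sketch, applying `apply_gnrp` to a GDP-bound protein switches it to the active state, and a subsequent `apply_gap` returns it to the inactive state.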

Other regulators of G-protein signaling (RGS) also exist that act primarily by negatively
regulating the G-protein pathway by an unknown mechanism (Druey, K.M. et al. (1996) Nature
379:742-746). Some 15 members of the RGS family have been identified. RGS family members are
related structurally through similarities in an approximately 120 amino acid region termed the RGS
domain and functionally by their ability to inhibit the interleukin (cytokine) induction of MAP kinase
in cultured mammalian 293T cells (Druey, supra).
Calcium Signaling Molecules

Ca²⁺ is another second messenger molecule that is even more widely used as an intracellular
mediator than cAMP. Two pathways exist by which Ca²⁺ can enter the cytosol in response to
extracellular signals. One pathway acts primarily in nerve signal transduction where Ca²⁺ enters a
nerve terminal through a voltage-gated Ca²⁺ channel. The second is a more ubiquitous pathway in
which Ca²⁺ is released from the ER into the cytosol in response to binding of an extracellular
signaling molecule to a receptor. Ca²⁺ directly activates regulatory enzymes, such as protein kinase
C, which trigger signal transduction pathways. Ca²⁺ also binds to specific Ca²⁺-binding proteins
(CBPs) such as calmodulin (CaM) which then activate multiple target proteins in the cell including
enzymes, membrane transport pumps, and ion channels. CaM interactions are involved in a multitude
of cellular processes including, but not limited to, gene regulation, DNA synthesis, cell cycle
progression, mitosis, cytokinesis, cytoskeletal organization, muscle contraction, signal transduction,
ion homeostasis, exocytosis, and metabolic regulation (Celio, M.R. et al. (1996) Guidebook to
Calcium-binding Proteins, Oxford University Press, pp. 15-20). Some CBPs can serve
as a storage depot for Ca²⁺ in an inactive state. Calsequestrin is one such CBP that is expressed in
isoforms specific to cardiac muscle and skeletal muscle. It is suggested that calsequestrin binds Ca²⁺
in a rapidly exchangeable state that is released during Ca²⁺-signaling conditions (Celio, M.R. et al.
(1996) Guidebook to Calcium-binding Proteins, Oxford University Press, New York NY, pp. 222-224).

Cyclins

Cell division is the fundamental process by which all living things grow and reproduce. In 
most organisms, the cell cycle consists of three principal steps: interphase, mitosis, and cytokinesis.
Interphase involves preparations for cell division, replication of the DNA, and production of essential
proteins. In mitosis, the nuclear material is divided and separates to opposite sides of the cell. 
Cytokinesis is the final division and fission of the cell cytoplasm to produce the daughter cells. 

The entry and exit of a cell from mitosis is regulated by the synthesis and destruction of a 
family of activating proteins called cyclins. Cyclins act by binding to and activating a group of 
cyclin-dependent protein kinases (Cdks) which then phosphorylate and activate selected proteins 
involved in the mitotic process. Several types of cyclins exist (Ciechanover, A. (1994) Cell
79:13-21). Two principal types are mitotic cyclin, or cyclin B, which controls entry of the cell into
mitosis, and G1 cyclin, which controls events that drive the cell out of mitosis.
Signal Complex Scaffolding Proteins 

Certain proteins in intracellular signaling pathways serve to link or cluster other proteins
involved in the signaling cascade. A conserved protein domain called the PDZ domain has been
identified in various membrane-associated signaling proteins. This domain has been implicated in
receptor and ion channel clustering and in the targeting of multiprotein signaling complexes to
specialized functional regions of the cytosolic face of the plasma membrane. (For a review of PDZ
domain-containing proteins, see Ponting, C.P. et al. (1997) Bioessays 19:469-479.) A large
proportion of PDZ domains are found in the eukaryotic MAGUK (membrane-associated guanylate 
kinase) protein family, members of which bind to the intracellular domains of receptors and channels. 
However, PDZ domains are also found in diverse membrane-localized proteins such as protein 
tyrosine phosphatases, serine/threonine kinases, G-protein cofactors, and synapse-associated proteins 
such as syntrophins and neuronal nitric oxide synthase (nNOS). Generally, about one to three PDZ 
domains are found in a given protein, although up to nine PDZ domains have been identified in a 
single protein. 

Membrane Transport Molecules 

The plasma membrane acts as a barrier to most molecules. Transport between the cytoplasm 
and the extracellular environment, and between the cytoplasm and lumenal spaces of cellular 
organelles requires specific transport proteins. Each transport protein carries a particular class of 
molecule, such as ions, sugars, or amino acids, and often is specific to a certain molecular species of

[...] TM6 and TM7, and play a critical role in maintaining intracellular pH by removing the protons that
are produced stoichiometrically with lactate during glycolysis. The best characterized
H(+)-monocarboxylate transporter is that of the erythrocyte membrane, which transports L-lactate
and a wide range of other aliphatic monocarboxylates. Other cells possess H(+)-linked
monocarboxylate transporters with differing substrate and inhibitor selectivities. In particular,
cardiac muscle and tumor cells have transporters that differ in their Km values for certain substrates,
including stereoselectivity for L- over D-lactate, and in their sensitivity to inhibitors. There are
Na(+)-monocarboxylate cotransporters on the luminal surface of intestinal and kidney epithelia,
which allow the uptake of lactate, pyruvate, and ketone bodies in these tissues. In addition, there are
specific and selective transporters for organic cations and organic anions in organs including the
kidney, intestine, and liver. Organic anion transporters are selective for hydrophobic, charged
molecules with electron-attracting side groups. Organic cation transporters, such as the ammonium
transporter, mediate the secretion of a variety of drugs and endogenous metabolites, and contribute to
the maintenance of intercellular pH. (Poole, R.C. and A.P. Halestrap (1993) Am. J. Physiol.
264:C761-C782; Price, N.T. et al. (1998) Biochem. J. 329:321-328; and Martinelle, K. and I.
Haggstrom (1993) J. Biotechnol. 30:339-350.)

The largest and most diverse family of transport proteins known is that of the ATP-binding cassette
(ABC) transporters. ABC transporters, also called the "traffic ATPases," comprise a superfamily
of membrane proteins that mediate transport and channel functions in prokaryotes and eukaryotes
(Higgins, C.F. (1992) Annu. Rev. Cell Biol. 8:67-113). ABC proteins share a similar overall
structure and significant sequence homology. All ABC proteins contain a conserved domain of
approximately two hundred amino acid residues which includes one or more nucleotide binding
domains. ABC proteins consist of four modules: two nucleotide-binding domains (NBD), which
hydrolyze ATP to supply the energy required for transport, and two membrane-spanning domains
(MSD), each containing six putative transmembrane segments. These four modules may be encoded
by a single gene, as is the case for the cystic fibrosis transmembrane regulator (CFTR), or by separate
genes. When encoded by separate genes, each gene product contains a single NBD and MSD. These
"half-molecules" form homo- and heterodimers, such as Tap1 and Tap2, the endoplasmic
reticulum-based major histocompatibility (MHC) peptide transport system. As a family, ABC
transporters can transport substances that differ markedly in chemical structure and size, ranging from
small molecules such as ions, sugars, amino acids, peptides, and phospholipids, to lipopeptides, large
proteins, and complex hydrophobic drugs. Mutations in ABC transporter genes are associated with
various disorders, such as hyperbilirubinemia II/Dubin-Johnson syndrome, recessive Stargardt's
disease, and celiac disease. Several genetic diseases are attributed to defects in ABC transporters,

such as the following diseases and their corresponding proteins: cystic fibrosis (CFTR, an ion
channel), X-linked adrenoleukodystrophy (adrenoleukodystrophy protein, ALDP), Zellweger

[...] transporters, including Na⁺-K⁺ ATPase, Ca²⁺-ATPase, and H⁺-ATPase, are activated by a
phosphorylation event. P-class ion transporters are responsible for maintaining resting potential
distributions such that cytosolic concentrations of Na⁺ and Ca²⁺ are low and cytosolic concentration
of K⁺ is high. The vacuolar (V) class of ion transporters includes H⁺ pumps on intracellular
organelles, such as lysosomes and Golgi. V-class ion transporters are responsible for generating the
low pH within the lumen of these organelles that is required for function. The coupling factor (F)
class consists of H⁺ pumps in the mitochondria. F-class ion transporters utilize a proton gradient to
generate ATP from ADP and inorganic phosphate (Pi).

The resting potential of the cell is utilized in many processes involving carrier proteins and
gated ion channels. Carrier proteins utilize the resting potential to transport molecules into and out of
the cell. Amino acid and glucose transport into many cells is linked to sodium ion co-transport
(symport) so that the movement of Na⁺ down an electrochemical gradient drives transport of the other
molecule up a concentration gradient. Similarly, cardiac muscle links transfer of Ca²⁺ out of the cell
with transport of Na⁺ into the cell (antiport).

Ion channels share common structural and mechanistic themes. The channel consists of four
or five subunits or protein monomers that are arranged like a barrel in the plasma membrane. Each
subunit typically consists of six potential transmembrane segments (S1, S2, S3, S4, S5, and S6). The
center of the barrel forms a pore lined by α-helices or β-strands. The side chains of the amino acid
residues comprising the α-helices or β-strands establish the charge (cation or anion) selectivity of the
channel. The degree of selectivity, or what specific ions are allowed to pass through the channel,
depends on the diameter of the narrowest part of the pore.

Gated ion channels control ion flow by regulating the opening and closing of pores. These 
channels are categorized according to the manner of regulating the gating function.
Mechanically-gated channels open pores in response to mechanical stress, voltage-gated channels open
pores in response to changes in membrane potential, and ligand-gated channels open pores in the
presence of a specific ion, nucleotide, or neurotransmitter.

Voltage-gated Na⁺ and K⁺ channels are necessary for the function of electrically excitable
cells, such as nerve and muscle cells. Action potentials, which lead to neurotransmitter release and
muscle contraction, arise from large, transient changes in the permeability of the membrane to Na⁺
and K⁺ ions. Depolarization of the membrane beyond the threshold level opens voltage-gated Na⁺
channels. Sodium ions flow into the cell, further depolarizing the membrane and opening more
voltage-gated Na⁺ channels, which propagates the depolarization down the length of the cell.
Depolarization also opens voltage-gated potassium channels. Consequently, potassium ions flow
outward, which leads to repolarization of the membrane. Voltage-gated channels utilize charged

residues in the fourth transmembrane segment (S4) to sense voltage change. The open state lasts only
about 1 millisecond, at which time the channel spontaneously converts into an inactive state that

[...] small molecules such as biogenic amines in chromaffin granules, processing of vacuolar constituents
such as pro-hormones by proteolytic enzymes, and protein degradation in lysosomes (Al-Awqati,
supra).

Ligand-gated channels open their pores when an extracellular or intracellular mediator binds
to the channel. Neurotransmitter-gated channels are channels that open when a neurotransmitter
binds to their extracellular domain. These channels exist in the postsynaptic membrane of nerve or
muscle cells. There are two types of neurotransmitter-gated channels. Sodium channels open in
response to excitatory neurotransmitters, such as acetylcholine, glutamate, and serotonin. This
opening causes an influx of Na⁺ and produces the initial localized depolarization that activates the
voltage-gated channels and starts the action potential. Chloride channels open in response to
inhibitory neurotransmitters, such as γ-aminobutyric acid (GABA) and glycine, leading to
hyperpolarization of the membrane, which opposes the generation of an action potential.

Ligand-gated channels can be regulated by intracellular second messengers.
Calcium-activated K⁺ channels are gated by internal calcium ions. In nerve cells, an influx of calcium
during depolarization opens K⁺ channels to modulate the magnitude of the action potential (Ishii, T.M.
et al. (1997) Proc. Natl. Acad. Sci. USA 94:11651-11656). Cyclic nucleotide-gated (CNG) channels are
gated by cytosolic cyclic nucleotides. The best examples of these are the cAMP-gated Na⁺ channels
involved in olfaction and the cGMP-gated cation channels involved in vision. Both systems involve
ligand-mediated activation of a G-protein coupled receptor which then alters the level of cyclic
nucleotide within the cell.

Ion channels are expressed in a number of tissues where they are implicated in a variety of 
processes. CNG channels, while abundantly expressed in photoreceptor and olfactory sensory cells,
are also found in kidney, lung, pineal, retinal ganglion cells, testis, aorta, and brain.
Calcium-activated K⁺ channels may be responsible for the vasodilatory effects of bradykinin in the
kidney and for shunting excess K⁺ from brain capillary endothelial cells into the blood. They are also
implicated in repolarizing granulocytes after agonist-stimulated depolarization (Ishii, supra). Ion
channels have been the target for many drug therapies. Neurotransmitter-gated channels have been
targeted in therapies for treatment of insomnia, anxiety, depression, and schizophrenia. Voltage-gated
channels have been targeted in therapies for arrhythmia, ischemic stroke, head trauma, and
neurodegenerative disease (Taylor, C.P. and L.S. Narasimhan (1997) Adv. Pharmacol. 39:47-98).
Disease Correlation 

The etiology of numerous human diseases and disorders can be attributed to defects in the 
transport of molecules across membranes. Defects in the trafficking of membrane-bound transporters 
and ion channels are associated with several disorders, e.g. cystic fibrosis, glucose-galactose 
malabsorption syndrome, hypercholesterolemia, von Gierke disease, and certain forms of diabetes
mellitus. Single-gene defect diseases resulting in an inability to transport small molecules across

[...] cytosolic calcium activated proteases, calpains. CPs are produced by monocytes, macrophages and
other cells of the immune system which migrate to sites of inflammation and secrete molecules
involved in tissue repair. Overabundance of these repair molecules plays a role in certain disorders.
In autoimmune diseases such as rheumatoid arthritis, secretion of the cysteine peptidase cathepsin C
degrades collagen, laminin, elastin and other structural proteins found in the extracellular matrix of
bones.

Aspartic proteases are members of the cathepsin family of lysosomal proteases and include 
pepsin A, gastricsin, chymosin, renin, and cathepsins D and E. Aspartic proteases have a pair of 
aspartic acid residues in the active site, and are most active in the pH 2-3 range, in which one of the
aspartate residues is ionized, the other un-ionized. Aspartic proteases include bacterial
penicillopepsin, mammalian pepsin, renin, chymosin, and certain fungal proteases. Abnormal
regulation and expression of cathepsins is evident in various inflammatory disease states. In cells
isolated from inflamed synovia, the mRNA for stromelysin, cytokines, TIMP-1, cathepsin, gelatinase,
and other molecules is preferentially expressed. Expression of cathepsins L and D is elevated in
synovial tissues from patients with rheumatoid arthritis and osteoarthritis. Cathepsin L expression
may also contribute to the influx of mononuclear cells which exacerbates the destruction of the
rheumatoid synovium. (Keyszer, G.M. (1995) Arthritis Rheum. 38:976-984.) The increased
expression and differential regulation of the cathepsins are linked to the metastatic potential of a
variety of cancers and as such are of therapeutic and prognostic interest (Chambers, A.F. et al. (1993)
Crit. Rev. Oncog. 4:95-114).

Metalloproteases have active sites that include two glutamic acid residues and one histidine 
residue that serve as binding sites for zinc. Carboxypeptidases A and B are the principal mammalian 
metalloproteases. Both are exoproteases of similar structure and active sites. Carboxypeptidase A,
like chymotrypsin, prefers C-terminal aromatic and aliphatic side chains of hydrophobic nature,
whereas carboxypeptidase B is directed toward basic arginine and lysine residues. Glycoprotease
(GCP), or O-sialoglycoprotein endopeptidase, is a metallopeptidase which specifically cleaves
O-sialoglycoproteins such as glycophorin A. Another metallopeptidase, placental leucine
aminopeptidase (P-LAP), degrades several peptide hormones such as oxytocin and vasopressin,
suggesting a role in maintaining homeostasis during pregnancy, and is expressed in several tissues
(Rogi, T. et al. (1996) J. Biol. Chem. 271:56-61).

Ubiquitin proteases are associated with the ubiquitin conjugation system (UCS), a major 
pathway for the degradation of cellular proteins in eukaryotic cells and some bacteria. The UCS
mediates the elimination of abnormal proteins and regulates the half-lives of important regulatory 
proteins that control cellular processes such as gene transcription and cell cycle progression. In the 

UCS pathway, proteins targeted for degradation are conjugated to ubiquitin, a small heat stable
protein. The ubiquitinated protein is then recognized and degraded by the proteasome, a large,

[...] ER where post-translational modifications are performed. These modifications include protein
folding and the formation of disulfide bonds, and N-linked glycosylations.

Protein Isomerases 

Protein folding in the ER is aided by two principal types of protein isomerases, protein 
disulfide isomerase (PDI), and peptidyl-prolyl isomerase (PPI). PDI catalyzes the oxidation of free
sulfhydryl groups in cysteine residues to form intramolecular disulfide bonds in proteins. PPI, an
enzyme that catalyzes the isomerization of certain proline imidic bonds in oligopeptides and proteins,
is considered to govern one of the rate limiting steps in the folding of many proteins to their final
functional conformation. The cyclophilins represent a major class of PPI that was originally
identified as the major receptor for the immunosuppressive drug cyclosporin A (Handschumacher,
R.E. et al. (1984) Science 226:544-547).
Protein Glycosylation

The glycosylation of most soluble secreted and membrane-bound proteins by 
oligosaccharides linked to asparagine residues in proteins is also performed in the ER. This reaction 
is catalyzed by a membrane-bound enzyme, oligosaccharyl transferase. Although the exact purpose 

of this "N-linked" glycosylation is unknown, the presence of oligosaccharides tends to make a
glycoprotein resistant to protease digestion. In addition, oligosaccharides attached to cell-surface
proteins called selectins are known to function in cell-cell adhesion processes (Alberts, B. et al.
(1994) Molecular Biology of the Cell, Garland Publishing Co., New York NY, p. 608). "O-linked"
glycosylation of proteins also occurs in the ER by the addition of N-acetylgalactosamine to the
hydroxyl group of a serine or threonine residue followed by the sequential addition of other sugar
residues to the first. This process is catalyzed by a series of glycosyltransferases each specific for a
particular donor sugar nucleotide and acceptor molecule (Lodish, H. et al. (1995) Molecular Cell
Biology, W.H. Freeman and Co., New York NY, pp. 700-708). In many cases, both N- and O-linked
oligosaccharides appear to be required for the secretion of proteins or the movement of plasma
membrane glycoproteins to the cell surface.

An additional glycosylation mechanism operates in the ER specifically to target lysosomal 
enzymes to lysosomes and prevent their secretion. Lysosomal enzymes in the ER receive an N- 
linked oligosaccharide, like plasma membrane and secreted proteins, but are then phosphorylated on 
one or two mannose residues. The phosphorylation of mannose residues occurs in two steps, the first
step being the addition of an N-acetylglucosamine phosphate residue by N-acetylglucosamine
phosphotransferase, and the second the removal of the N-acetylglucosamine group by
phosphodiesterase. The phosphorylated mannose residue then targets the lysosomal enzyme to a
mannose 6-phosphate receptor which transports it to a lysosome vesicle (Lodish, supra, pp. 708-711).
Chaperones 

Molecular chaperones are proteins that aid in the proper folding of immature proteins and
refolding of improperly folded ones, the assembly of protein subunits, and in the transport of

[...] polymerization proceeds in a 5' to 3' direction by addition of a ribonucleoside monophosphate to the
3'-OH end of a growing RNA chain. Transcription of DNA into RNA takes place in the nucleus and
is catalyzed by RNA polymerases. DNA transcription generates messenger RNAs (mRNA) that
carry information for protein synthesis, as well as the transfer, ribosomal, and other RNAs that have
structural or catalytic functions. Three types of RNA polymerase exist (Alberts, supra, pp. 367-368).
RNA polymerase I makes the large ribosomal RNAs, RNA polymerase II makes the mRNAs that will
be translated into proteins, and RNA polymerase III makes a variety of small, stable RNAs, including
5S ribosomal RNA and the transfer RNAs (tRNA). In all cases, RNA synthesis is initiated by
binding of the RNA polymerase to a promoter region on the DNA and synthesis begins at a start site
within the promoter. Synthesis is completed at a broad, general stop or termination region in the
DNA where both the polymerase and the completed RNA chain are released. The primary transcript
of RNA polymerase II is called heterogeneous nuclear RNA (hnRNA), and must be further processed
by splicing to remove non-coding sequences called introns. RNA splicing is mediated by small
nuclear ribonucleoprotein complexes, or snRNPs, producing mature messenger RNA (mRNA) which
is then transported out of the nucleus for translation into proteins.

Ligases

DNA repair is the process by which accidental base changes, such as those produced by 
oxidative damage, hydrolytic attack, or uncontrolled methylation of DNA are corrected before 
replication or transcription of the DNA can occur. Because of the efficiency of the DNA repair
process, fewer than one in one thousand accidental base changes causes a mutation (Alberts, supra , 
pp. 245-249). The three steps common to most types of DNA repair are (1) excision of the damaged 
or altered base or nucleotide by DNA nucleases, leaving a gap; (2) insertion of the correct nucleotide 
in this gap by DNA polymerase using the complementary strand as the template; and (3) sealing the 
break left between the inserted nucleotide(s) and the existing DNA strand by DNA ligase. In the last 
reaction, DNA ligase uses the energy from ATP hydrolysis to activate the 5' end of the broken
phosphodiester bond before forming the new bond with the 3'-OH of the DNA strand. In Bloom's
syndrome, an inherited human disease, individuals are partially deficient in DNA ligation and
consequently have an increased incidence of cancer (Alberts, supra, p. 247).
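
The three repair steps listed above (excision, insertion, and sealing) can be sketched in Python as follows. This is a simplified illustration rather than a biochemical model: the function name `repair_base`, the use of plain strings for strands, and the convention that the template strand is written in the same left-to-right register as the damaged strand (antiparallel orientation is ignored) are all assumptions made for the example.

```python
# Watson-Crick pairing used to choose the correct nucleotide for the gap.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def repair_base(damaged, template, position):
    """Repair a single damaged position in `damaged` using `template`.

    `template` is the complementary strand, written so that index i of
    `template` pairs with index i of `damaged` (an illustrative assumption).
    """
    strand = list(damaged)

    # Step 1 (nuclease): excise the damaged or altered base, leaving a gap.
    strand[position] = None

    # Step 2 (DNA polymerase): insert the correct nucleotide into the gap,
    # using the complementary strand as the template.
    strand[position] = COMPLEMENT[template[position]]

    # Step 3 (DNA ligase): seal the break so the strand is continuous again.
    return "".join(strand)
```

For example, with a damaged strand `"ATGXCA"` (where `X` stands for an altered base) and the template `"TACCGT"`, calling `repair_base("ATGXCA", "TACCGT", 3)` fills the gap with `G`, the complement of the template's `C`.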
Nucleases 

Nucleases comprise enzymes that hydrolyze both DNA (DNase) and RNA (RNase). They
serve different purposes in nucleic acid metabolism. Nucleases hydrolyze the phosphodiester bonds
between adjacent nucleotides either at internal positions (endonucleases) or at the terminal 3' or 5'
nucleotide positions (exonucleases). A DNA exonuclease activity in DNA polymerase, for example,
serves to remove improperly paired nucleotides attached to the 3'-OH end of the growing DNA strand
by the polymerase and thereby serves a "proofreading" function. As mentioned above, DNA
endonuclease activity is involved in the excision step of the DNA repair process.
RNases also serve a variety of functions. For example, RNase P is a ribonucleoprotein
enzyme which cleaves the 5' end of pre-tRNAs as part of their maturation process. RNase H digests
the RNA strand of an RNA/DNA hybrid. Such hybrids occur in cells invaded by retroviruses, and
RNase H is an important enzyme in the retroviral replication cycle. Pancreatic RNase secreted by the
pancreas into the intestine hydrolyzes RNA present in ingested foods. RNase activity in serum and
cell extracts is elevated in a variety of cancers and infectious diseases (Schein, C.H. (1997) Nat.
Biotechnol. 15:529-536). Regulation of RNase activity is being investigated as a means to control
tumor angiogenesis, allergic reactions, viral infection and replication, and [...]

Methylases

Methylation of specific nucleotides occurs in both DNA and RNA, and serves different
functions in the two macromolecules. Methylation of cytosine residues to form 5-methyl cytosine in
DNA occurs specifically at CG sequences which are base-paired with one another in the DNA
double-helix. This pattern of methylation is passed from generation to generation during DNA
replication by an enzyme called "maintenance methylase" that acts preferentially on those CG
sequences that are base-paired with a CG sequence that is already methylated. Such methylation
appears to distinguish active from inactive genes by preventing the binding of regulatory proteins that
"turn on" the gene, but permit the binding of proteins that inactivate the gene (Alberts, supra, pp.
448-451). In RNA metabolism, "tRNA methylase" produces one of several nucleotide modifications in
tRNA that affect the conformation and base-pairing of the molecule and facilitate the recognition of
the appropriate mRNA codons by specific tRNAs. The primary methylation pattern is the
dimethylation of guanine residues to form N,N-dimethyl guanine.
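
The inheritance of the CG methylation pattern by a maintenance methylase, as described above, can be illustrated with a short Python sketch. The function names `find_cg_sites` and `maintenance_methylate`, and the representation of methylation marks as a set of sequence positions, are assumptions introduced only for this example. Because a CG sequence pairs with a CG sequence on the opposite strand, a single index is used here to stand for both halves of a paired site.

```python
def find_cg_sites(seq):
    """Return the start index of every CG dinucleotide in `seq`."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def maintenance_methylate(seq, parental_marks):
    """Return the CG sites methylated on a newly synthesized strand.

    The new strand starts out unmethylated. The maintenance methylase acts
    only on CG sites whose paired CG on the parental strand already carries
    a methyl group, so the parental pattern is copied to the new strand.
    """
    return {site for site in find_cg_sites(seq) if site in parental_marks}
```

For the sequence `"ACGTCGACG"`, the CG sites start at positions 1, 4, and 7. If the parental strand is methylated at positions 1 and 7 only, the newly made strand is methylated at those same two sites, and the unmethylated site at position 4 stays unmethylated.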
Helicases and Single-stranded Binding Proteins

Helicases are enzymes that destabilize and unwind double helix structures in both DNA and
RNA. Since DNA replication occurs more or less simultaneously on both strands, the two strands
must first separate to generate a replication "fork" for DNA polymerase to act on. Two types of
replication proteins contribute to this process: DNA helicases and single-stranded binding proteins.
DNA helicases hydrolyze ATP and use the energy of hydrolysis to separate the DNA strands.
Single-stranded binding proteins (SSBs) then bind to the exposed DNA strands without covering the
bases, thereby temporarily stabilizing them for templating by the DNA polymerase (Alberts, supra, pp.
255-[...]).

RNA helicases also alter and regulate RNA conformation and secondary structure. Like the
DNA helicases, RNA helicases utilize energy derived from ATP hydrolysis to destabilize and unwind
RNA duplexes. The most well-characterized and ubiquitous family of RNA helicases is the
DEAD-box family, so named for the conserved B-type ATP-binding motif which is diagnostic of
proteins in this family. Over 40 DEAD-box helicases have been identified in organisms as diverse as
bacteria, insects, yeast, amphibians, mammals, and plants. DEAD-box helicases function in diverse
processes such as translation initiation, splicing, ribosome assembly, and RNA editing, transport, and
stability. Some DEAD-box helicases play tissue- and stage-specific roles in spermatogenesis and
embryogenesis. Overexpression of the DEAD-box 1 protein (DDX1) may play a role in the
progression of neuroblastoma (Nb) and retinoblastoma (Rb) tumors (Godbout, R. et al. (1998) J. Biol.
Chem. 273:21161-21168). These observations suggest that DDX1 may promote or enhance tumor
progression by altering the normal secondary structure and expression levels of RNA in cancer cells.
Other DEAD-box helicases have been implicated either directly or indirectly in tumorigenesis
(discussed in Godbout, supra). For example, murine p68 is mutated in ultraviolet light-induced
tumors, and human DDX6 is located at a chromosomal breakpoint associated with B-cell lymphoma.
Similarly, a chimeric protein composed of DDX10 and NUP98, a nucleoporin protein, may be
involved in the pathogenesis of certain myeloid malignancies.

Topoisomerases 

Besides the need to separate DNA strands prior to replication, the two strands must be 
"unwound" from one another prior to their separation by DNA helicases. This function is performed 

by proteins known as DNA topoisomerases. Topoisomerases are enzymes that affect the topological
state of DNA. For example, defects in topoisomerases or their regulation can affect normal
physiology. DNA topoisomerase effectively acts as a reversible nuclease that hydrolyzes a
phosphodiester bond in a DNA strand, permitting the two strands to rotate freely about one
another to remove the strain of the helix, and then rejoins the original phosphodiester bond between
the two strands. Two types of DNA topoisomerase exist, types I and II. DNA topoisomerase I
causes a single-strand break in a DNA helix to allow the rotation of the two strands of the helix about
the remaining phosphodiester bond in the opposite strand. DNA topoisomerase II causes a transient
break in both strands of a DNA helix where two double helices cross over one another. This type of
topoisomerase can efficiently separate two interlocked DNA circles (Alberts, supra, pp. 260-262).
Type II topoisomerases are largely confined to proliferating cells in eukaryotes, such as cancer cells.
For this reason they are targets for anticancer drugs. Topoisomerase II has been implicated in
multi-drug resistance (MDR) as it appears to aid in the repair of DNA damage inflicted by DNA binding
agents such as doxorubicin and vincristine. Reduced levels of topoisomerase II have been correlated
with some of the DNA processing defects associated with the disorder ataxia-telangiectasia (Singh,
S.P. et al. (1988) Nucleic Acids Res. 16:3919-3929).
Recombinases 

Genetic recombination is the process of rearranging DNA sequences within an organism's
genome to provide genetic variation for the organism in response to changes in the environment.
DNA recombination allows variation in the particular combination of genes present in an individual's
genome, as well as the timing and level of expression of these genes (see Alberts, supra, pp. 263-273).
Two broad classes of genetic recombination are commonly recognized, general recombination

protocadherins. Classical cadherins include the E-cadherin, N-cadherin, and P-cadherin subfamilies.
E-cadherin is present on many types of epithelial cells and is especially important for embryonic
development. N-cadherin is present on nerve, muscle, and lens cells and is also critical for
embryonic development. P-cadherin is present on cells of the placenta and epidermis. Recent studies
report that protocadherins are involved in a variety of cell-cell interactions (Suzuki, S.T. (1996) J.
Cell Sci. 109:2609-2611). The intracellular anchorage of cadherins is regulated by their dynamic
association with catenins, a family of cytoplasmic signal transduction proteins associated with the
actin cytoskeleton. The anchorage of cadherins to the actin cytoskeleton appears to be regulated by
protein tyrosine phosphorylation, and the cadherins are the target of phosphorylation-induced
junctional disassembly (Aberle, H. et al. (1996) J. Cell. Biochem. 61:514-523).

Integrins

Integrins are ubiquitous transmembrane adhesion molecules that link the ECM to the internal
cytoskeleton. Integrins are composed of two noncovalently associated transmembrane glycoprotein
subunits called α and β. Integrins function as receptors that play a role in signal transduction. For
example, binding of integrin to its extracellular ligand may stimulate changes in intracellular calcium
levels or protein kinase activity (Sjaastad, M.D. and W.J. Nelson (1997) BioEssays 19:47-55). At
least ten cell surface receptors of the integrin family recognize the ECM component fibronectin,
which is involved in many different biological processes including cell migration and embryogenesis
(Johansson, S. et al. (1997) Front. Biosci. 2:D126-D146).

Lectins

Lectins comprise a ubiquitous family of extracellular glycoproteins which bind cell surface
carbohydrates specifically and reversibly, resulting in the agglutination of cells (reviewed in
Drickamer, K. and M.E. Taylor (1993) Annu. Rev. Cell Biol. 9:237-264). This function is
particularly important for activation of the immune response. Lectins mediate the agglutination and
mitogenic stimulation of lymphocytes at sites of inflammation (Lasky, L.A. (1991) J. Cell. Biochem.
45:139-146; Paietta, E. et al. (1989) J. Immunol. 143:2850-2857).

Lectins are further classified into subfamilies based on carbohydrate-binding specificity and
other criteria. The galectin subfamily, in particular, includes lectins that bind β-galactoside
carbohydrate moieties in a thiol-dependent manner (reviewed in Hadari, Y.R. et al. (1998) J. Biol.
Chem. 270:3447-3453). Galectins are widely expressed and developmentally regulated. Because all
galectins lack an N-terminal signal peptide, it is suggested that galectins are externalized through an
atypical secretory mechanism. Two classes of galectins have been defined based on molecular
weight and oligomerization properties. Small galectins form homodimers and are about 14 to 16
kilodaltons in mass, while large galectins are monomeric and about 29-37 kilodaltons.

Galectins contain a characteristic carbohydrate recognition domain (CRD). The CRD is
about 140 amino acids and contains several stretches of about 1-10 amino acids which are highly
conserved among galectins. A particular 6-amino acid motif within the CRD contains conserved
tryptophan and arginine residues which are critical for carbohydrate binding. The CRD of some
galectins also contains cysteine residues which may be important for disulfide bond formation.
Secondary structure predictions indicate that the CRD forms several β-sheets.

Galectins play a number of roles in diseases and conditions associated with cell-cell and
cell-matrix interactions. For example, certain galectins associate with sites of inflammation and bind to
cell surface immunoglobulin E molecules. In addition, galectins may play an important role in
cancer metastasis. Galectin overexpression is correlated with the metastatic potential of cancers in
humans and mice. Moreover, anti-galectin antibodies inhibit processes associated with cell
transformation, such as cell aggregation and anchorage-independent growth (see, for example, Su,
Z.-Z. et al. (1996) Proc. Natl. Acad. Sci. USA 93:7252-7257).
Selectins 

Selectins, or LEC-CAMs, comprise a specialized lectin subfamily involved primarily in
inflammation and leukocyte adhesion (reviewed in Lasky, supra). Selectins mediate the recruitment
of leukocytes from the circulation to sites of acute inflammation and are expressed on the surface of
vascular endothelial cells in response to cytokine signaling. Selectins bind to specific ligands on the
leukocyte cell membrane and enable the leukocyte to adhere to and migrate along the endothelial
surface. Binding of selectin to its ligand leads to polarized rearrangement of the actin cytoskeleton
and stimulates signal transduction within the leukocyte (Brenner, B. et al. (1997) Biochem. Biophys.
Res. Commun. 231:802-807; Hidari, K.I. et al. (1997) J. Biol. Chem. 272:28750-28756). Members
of the selectin family possess three characteristic motifs: a lectin or carbohydrate recognition
domain; an epidermal growth factor-like domain; and a variable number of short consensus repeats
(scr or "sushi" repeats) which are also present in complement regulatory proteins. The selectins
include lymphocyte adhesion molecule-1 (Lam-1 or L-selectin), endothelial leukocyte adhesion
molecule-1 (ELAM-1 or E-selectin), and granule membrane protein-140 (GMP-140 or P-selectin)
(Johnston, G.I. et al. (1989) Cell 56:1033-1044).
Secreted and Extracellular Matrix Molecules 

Protein transport and secretion are essential for cellular function. Protein transport and
secretion are mediated by a signal peptide located at the amino terminus of the protein to be secreted.
The signal peptide is composed of about ten to twenty hydrophobic amino acids which target the
nascent protein from the ribosome to the endoplasmic reticulum (ER). Proteins targeted to the ER
may either proceed through the secretory pathway or remain in any of the secretory organelles such as
the ER, Golgi apparatus, or lysosomes. Proteins that transit through the secretory pathway are either
secreted into the extracellular space or retained in the plasma membrane. Proteins that are retained in
the plasma membrane contain one or more transmembrane domains, each comprised of about 20
hydrophobic amino acid residues. Secreted proteins are often synthesized as inactive precursors that
are activated by post-translational processing events during transit through the secretory pathway.
Such events include glycosylation, proteolysis, and removal of the signal peptide by a signal
peptidase. Other events that may occur during protein transport include chaperone-dependent
unfolding and folding of the nascent protein and interaction of the protein with a receptor or pore
complex. Examples of secreted proteins with amino terminal signal peptides include receptors,
extracellular matrix molecules, cytokines, hormones, growth and differentiation factors,
neuropeptides, vasomediators, ion channels, transporters/pumps, and proteases. (Reviewed in
Alberts, B. et al. (1994) Molecular Biology of The Cell, Garland Publishing, New York NY, pp. 557-560,
582-592.)
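
As a minimal illustrative sketch, the hydrophobic stretches described above (about ten to twenty residues for a signal peptide, about 20 for a transmembrane domain) could be located by scanning a protein sequence for its longest uninterrupted run of hydrophobic residues. The residue set, the function name, and the example sequence are assumptions chosen for illustration; they are not taken from the specification, and established prediction methods use more elaborate hydropathy scales.

```python
# Assumed set of hydrophobic residues (one-letter codes); other scales exist.
HYDROPHOBIC = set("AILMFVW")


def longest_hydrophobic_run(seq):
    """Return (start, length) of the longest run of hydrophobic residues."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, aa in enumerate(seq.upper()):
        if aa in HYDROPHOBIC:
            if run_len == 0:
                run_start = i  # a new run begins at this residue
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0  # a non-hydrophobic residue ends the current run
    return best_start, best_len


if __name__ == "__main__":
    # Hypothetical sequence: three leading residues, then twelve leucines.
    print(longest_hydrophobic_run("MKT" + "L" * 12 + "DE"))
```

A run of roughly ten to twenty residues near the amino terminus would be consistent with a candidate signal peptide under these assumptions.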

The extracellular matrix (ECM) is a complex network of glycoproteins, polysaccharides,
proteoglycans, and other macromolecules that are secreted from the cell into the extracellular space.
The ECM remains in close association with the cell surface and provides a supportive meshwork that
profoundly influences cell shape, motility, strength, flexibility, and adhesion. In fact, adhesion of a
cell to its surrounding matrix is required for cell survival except in the case of metastatic tumor cells,
which have overcome the need for cell-ECM anchorage. This phenomenon suggests that the ECM
plays a critical role in the molecular mechanisms of growth control and metastasis. (Reviewed in
Ruoslahti, E. (1996) Sci. Am. 275:72-77.) Furthermore, the ECM determines the structure and
physical properties of connective tissue and is particularly important for morphogenesis and other
processes associated with embryonic development and pattern formation.

The collagens comprise a family of ECM proteins that provide structure to bone, teeth, skin,
ligaments, tendons, cartilage, blood vessels, and basement membranes. Multiple collagen proteins
have been identified. Three collagen molecules fold together in a triple helix stabilized by interchain
disulfide bonds. Bundles of these triple helices then associate to form fibrils. Collagen primary
structure consists of hundreds of (Gly-X-Y) repeats where about a third of the X and Y residues are
Pro. Glycines are crucial to helix formation as the bulkier amino acid sidechains cannot fold into the
triple helical conformation. Because of these strict sequence requirements, mutations in collagen
genes have severe consequences. Osteogenesis imperfecta patients have brittle bones that fracture
easily; in severe cases patients die in utero or at birth. Ehlers-Danlos syndrome patients have
hyperelastic skin, hypermobile joints, and susceptibility to aortic and intestinal rupture.
Chondrodysplasia patients have short stature and ocular disorders. Alport syndrome patients have
hematuria, sensorineural deafness, and eye lens deformation. (Isselbacher, K.J. et al. (1994)
Harrison's Principles of Internal Medicine, McGraw-Hill, Inc., New York NY, pp. 2105-2117; and
Creighton, T.E. (1984) Proteins: Structures and Molecular Principles, W.H. Freeman and Company,
New York NY, pp. 191-197.)
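
The (Gly-X-Y) repeat structure described above lends itself to a simple sequence check. The following is a minimal sketch, assuming one-letter amino acid codes, that reports the longest run of consecutive triplets beginning with glycine in any of the three reading frames. The function name and the short example sequences are hypothetical and serve only to illustrate the repeat pattern.

```python
def longest_gly_xy_run(seq):
    """Return the longest run of consecutive Gly-X-Y triplets, in triplets."""
    seq = seq.upper()
    best = 0
    for frame in range(3):  # try each of the three possible triplet registers
        run = 0
        for i in range(frame, len(seq) - 2, 3):
            if seq[i] == "G":  # a triplet qualifies when its first residue is Gly
                run += 1
                best = max(best, run)
            else:
                run = 0  # a non-Gly first position breaks the repeat
    return best


if __name__ == "__main__":
    print(longest_gly_xy_run("GPPGPAGKD"))  # three triplets in frame 0
    print(longest_gly_xy_run("AAGPPGPA"))   # two triplets in frame 2
```

Under this sketch, a substitution of a glycine at the first position of a triplet interrupts the run, which mirrors the statement that glycine is required at every third position for the triple helix to form.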

Elastin and related proteins confer elasticity to tissues such as skin, blood vessels, and lungs.
Elastin is a highly hydrophobic protein of about 750 amino acids that is rich in proline and glycine

fibers and sheets. Elastin fibers are surrounded by a sheath of microfibrils, which are composed of a

display extensive morphological defects, including defects in the formation of the
notochord, heart, blood vessels, neural tube, and extraembryonic structures. (Reviewed in
Alberts, supra, pp. 986-987.)

Laminin is a major glycoprotein component of the basal lamina which underlies and supports
epithelial cell sheets. Laminin is one of the first ECM proteins synthesized in the developing embryo.
Laminin is an 850 kilodalton protein composed of three polypeptide chains joined in the shape of a
cross by disulfide bonds. Laminin is especially important for angiogenesis and, in particular, for
guiding the formation of capillaries. (Reviewed in Alberts, supra, pp. 990-991.)

There are many other types of proteinaceous ECM components, most of which can be
classified as proteoglycans. Proteoglycans are composed of unbranched polysaccharide chains
(glycosaminoglycans) attached to protein cores. Common proteoglycans include aggrecan,
betaglycan, decorin, perlecan, serglycin, and syndecan-1. Some of these molecules not only provide
mechanical support, but also bind to extracellular signaling molecules, such as fibroblast growth
factor and transforming growth factor β, suggesting a role for proteoglycans in cell-cell
communication and cell growth. (Reviewed in Alberts, supra, pp. 973-978.) Likewise, the
glycoproteins tenascin-C and tenascin-R are expressed in developing and lesioned neural tissue and
provide stimulatory and anti-adhesive (inhibitory) properties, respectively, for axonal growth.
(Faissner, A. (1997) Cell Tissue Res. 290:331-341.)

Cytoskeletal Molecules

organelles and other elements move in the cytosol. The cytoskeleton is a dynamic structure that
allows cells to adopt various shapes and to carry out directed movements. Major cytoskeletal fibers
include the microtubules, the microfilaments, and the intermediate filaments. Motor proteins,
including myosin, dynein, and kinesin, drive movement of or along the fibers. The motor protein
dynamin drives the formation of membrane vesicles. Accessory or associated proteins modify the
structure or activity of the fibers while cytoskeletal membrane anchors connect the fibers to the cell
membrane.

Tubulins 

Microtubules, cytoskeletal fibers with a diameter of about 24 nm, have multiple roles in the
cell. Bundles of microtubules form cilia and flagella, which are whip-like extensions of the cell
membrane that are necessary for sweeping materials across an epithelium and for swimming of
sperm, respectively. Marginal bands of microtubules in red blood cells and platelets are important
for these cells' pliability. Organelles, membrane vesicles, and proteins are transported in the cell
along tracks of microtubules. For example, microtubules run through nerve cell axons, allowing
bi-directional transport of materials and membrane vesicles between the cell body and the nerve
terminal. Failure to supply the nerve terminal with these vesicles blocks the transmission of neural
signals. Microtubules are also critical to chromosomal movement during cell division. Both stable
and short-lived populations of microtubules exist in the cell.

Microtubules are polymers of GTP-binding tubulin protein subunits. Each subunit is a
heterodimer of α- and β-tubulin, multiple isoforms of which exist. The hydrolysis of GTP is linked
to the addition of tubulin subunits at the end of a microtubule. The subunits interact head to tail to
form protofilaments; the protofilaments interact side to side to form a microtubule. A microtubule is
polarized, one end ringed with α-tubulin and the other with β-tubulin, and the two ends differ in their
rates of assembly. Generally, each microtubule is composed of 13 protofilaments although 11 or 15
protofilament-microtubules are sometimes found. Cilia and flagella contain doublet microtubules.
Microtubules grow from specialized structures known as centrosomes or microtubule-organizing
centers (MTOCs). MTOCs may contain one or two centrioles, which are pinwheel arrays of triplet
microtubules. The basal body, the organizing center located at the base of a cilium or flagellum,
contains one centriole. Gamma tubulin present in the MTOC is important for nucleating the
polymerization of α- and β-tubulin heterodimers but does not polymerize into microtubules.
Microtubule-Associated Proteins 

Microtubule-associated proteins (MAPs) have roles in the assembly and stabilization of
microtubules. One major family of MAPs, assembly MAPs, can be identified in neurons as well as
non-neuronal cells. Assembly MAPs are responsible for cross-linking microtubules in the cytosol.
These MAPs are organized into two domains: a basic microtubule-binding domain and an acidic
projection domain. The projection domain is the binding site for membranes, intermediate filaments,

but rather associate with microtubules and dynein.

Actin-Associated Proteins 

Actin-associated proteins have roles in cross-linking, severing, and stabilization of actin
filaments and in sequestering actin monomers. Several of the actin-associated proteins have multiple
functions. Bundles and networks of actin filaments are held together by actin cross-linking proteins.
These proteins have two actin-binding sites, one for each filament. Short cross-linking proteins
promote bundle formation while longer, more flexible cross-linking proteins promote network
formation. Calmodulin-like calcium-binding domains in actin cross-linking proteins allow calcium
regulation of cross-linking. Group I cross-linking proteins have unique actin-binding domains and
include the 30 kD protein, EF-1α, fascin, and scruin. Group II cross-linking proteins have a
7,000-MW actin-binding domain and include villin and dematin. Group III cross-linking proteins have
pairs of a 26,000-MW actin-binding domain and include fimbrin, spectrin, dystrophin, ABP 120, and
filamin.

Severing proteins regulate the length of actin filaments by breaking them into short pieces or
by blocking their ends. Severing proteins include gCAP39, severin (fragmin), gelsolin, and villin.
Capping proteins can cap the ends of actin filaments, but cannot break filaments. Capping proteins
include CapZ and tropomodulin. The proteins thymosin and profilin sequester actin monomers in the
cytosol, allowing a pool of unpolymerized actin to exist. The actin-associated proteins tropomyosin,
troponin, and caldesmon regulate muscle contraction in response to calcium.
Intermediate Filaments and Associated Proteins 

Intermediate filaments (IFs) are cytoskeletal fibers with a diameter of about 10 nm,
intermediate between that of microfilaments and microtubules. IFs serve structural roles in the cell,
reinforcing cells and organizing cells into tissues. IFs are particularly abundant in epidermal cells
and in neurons. IFs are extremely stable, and, in contrast to microfilaments and microtubules, do not
function in cell motility.

Five types of IF proteins are known in mammals. Type I and Type II proteins are the acidic
and basic keratins, respectively. Heterodimers of the acidic and basic keratins are the building blocks
of keratin IFs. Keratins are abundant in soft epithelia such as skin and cornea, hard epithelia such as
nails and hair, and in epithelia that line internal body cavities. Mutations in keratin genes lead to
epithelial diseases including epidermolysis bullosa simplex, bullous congenital ichthyosiform
erythroderma (epidermolytic hyperkeratosis), non-epidermolytic and epidermolytic palmoplantar
keratoderma, ichthyosis bullosa of Siemens, pachyonychia congenita, and white sponge nevus. Some
of these diseases result in severe skin blistering. (See, e.g., Wawersik, M. et al. (1997) J. Biol. Chem.
272:32557-32565; and Corden, L.D. and W.H. McLean (1996) Exp. Dermatol. 5:297-307.)

Type III IF proteins include desmin, glial fibrillary acidic protein, vimentin, and peripherin.
Desmin filaments in muscle cells link myofibrils into bundles and stabilize sarcomeres in contracting

rigidity to the epithelium. IFs in epithelial cells are attached to the desmosome by plakoglobin and
desmoplakins. The proteins that link IFs to hemidesmosomes are not known. Desmin IFs surround
the sarcomere in muscle and are linked to the plasma membrane by paranemin, synemin, and ankyrin.

Myosin-related Motor Proteins

Myosins are actin-activated ATPases, found in eukaryotic cells, that couple hydrolysis of
ATP with motion. Myosin provides the motor function for muscle contraction and intracellular
movements such as phagocytosis and rearrangement of cell contents during mitotic cell division
(cytokinesis). The contractile unit of skeletal muscle, termed the sarcomere, consists of highly
ordered arrays of thin actin-containing filaments and thick myosin-containing filaments.
Crossbridges form between the thick and thin filaments, and the ATP-dependent movement of myosin
heads within the thick filaments pulls the thin filaments, shortening the sarcomere and thus the
muscle fiber.

Myosins are composed of one or two heavy chains and associated light chains. Myosin
heavy chains contain an amino-terminal motor or head domain, a neck that is the site of light-chain
binding, and a carboxy-terminal tail domain. The tail domains may associate to form an α-helical
coiled coil. Conventional myosins, such as those found in muscle tissue, are composed of two
myosin heavy-chain subunits, each associated with two light-chain subunits that bind at the neck
region and play a regulatory role. Unconventional myosins, believed to function in intracellular
motion, may contain either one or two heavy chains and associated light chains. There is evidence
for about 25 myosin heavy chain genes in vertebrates, more than half of them unconventional.
Dynein-related Motor Proteins

Dyneins are (-) end-directed motor proteins which act on microtubules. Two classes of
dyneins, cytosolic and axonemal, have been identified. Cytosolic dyneins are responsible for
translocation of materials along cytoplasmic microtubules, for example, transport from the nerve
terminal to the cell body and transport of endocytic vesicles to lysosomes. Cytoplasmic dyneins are
also reported to play a role in mitosis. Axonemal dyneins are responsible for the beating of flagella
and cilia. Dynein on one microtubule doublet walks along the adjacent microtubule doublet. This
sliding force produces bending forces that cause the flagellum or cilium to beat. Dyneins have a
native mass between 1000 and 2000 kDa and contain either two or three force-producing heads driven
by the hydrolysis of ATP. The heads are linked via stalks to a basal domain which is composed of a
highly variable number of accessory intermediate and light chains.
Kinesin-related Motor Proteins 

Kinesins are (+) end-directed motor proteins which act on microtubules. The prototypical
kinesin molecule is involved in the transport of membrane-bound vesicles and organelles. This
function is particularly important for axonal transport in neurons. Kinesin is also important in all cell
types for the transport of vesicles from the Golgi complex to the endoplasmic reticulum. This role is

other proteins. Some dynamin-related proteins do not contain the pleckstrin homology domain or the
proline-rich domain. (See McNiven, M.A. (1998) Cell 94:151-154; Scaife, R.M. and R.L. Margolis
(1997) Cell. Signal. 9:395-401.)

The cytoskeleton is reviewed in Lodish, H. et al. (1995) Molecular Cell Biology, Scientific
American Books, New York NY.

Ribosomal Molecules 

Ribosomal RNAs (rRNAs) are assembled, along with ribosomal proteins, into ribosomes,
which are cytoplasmic particles that translate messenger RNA into polypeptides. The eukaryotic
ribosome is composed of a 60S (large) subunit and a 40S (small) subunit, which together form the
80S ribosome. In addition to the 18S, 28S, 5S, and 5.8S rRNAs, the ribosome also contains more
than fifty proteins. The ribosomal proteins have a prefix which denotes the subunit to which they
belong, either L (large) or S (small). Ribosomal protein activities include binding rRNA and
organizing the conformation of the junctions between rRNA helices (Woodson, S.A. and N.B.
Leontis (1998) Curr. Opin. Struct. Biol. 8:294-300; Ramakrishnan, V. and S.W. White (1998) Trends
Biochem. Sci. 23:208-212). Three important sites are identified on the ribosome. The
aminoacyl-tRNA site (A site) is where charged tRNAs (with the exception of the initiator tRNA) bind on arrival
at the ribosome. The peptidyl-tRNA site (P site) is where new peptide bonds are formed, as well as
where the initiator tRNA binds. The exit site (E site) is where deacylated tRNAs bind prior to their
release from the ribosome. (The ribosome is reviewed in Stryer, L. (1995) Biochemistry, W.H.
Freeman and Company, New York NY, pp. 888-908; and Lodish, H. et al. (1995) Molecular Cell
Biology, Scientific American Books, New York NY, pp. 119-138.)
Chromatin Molecules 

The nuclear DNA of eukaryotes is organized into chromatin. Two types of chromatin are
observed: euchromatin, some of which may be transcribed, and heterochromatin so densely packed
that much of it is inaccessible to transcription. Chromatin packing thus serves to regulate protein
expression in eukaryotes. Bacteria lack chromatin and the chromatin-packing level of gene
regulation.

The fundamental unit of chromatin is the nucleosome of 200 DNA base pairs associated with
two copies each of histones H2A, H2B, H3, and H4. Adjacent nucleosomes are linked by another
class of histones, H1. Low molecular weight non-histone proteins called the high mobility group
(HMG), associated with chromatin, may function in the unwinding of DNA and stabilization of
single-stranded DNA. Chromodomain proteins function in compaction of chromatin into its
transcriptionally silent heterochromatin form.

During mitosis, all DNA is compacted into heterochromatin and transcription ceases.
Transcription in interphase begins with the activation of a region of chromatin. Active chromatin is
decondensed. Decondensation appears to be accompanied by changes in binding coefficient.

production in cells.

Mitochondria contain a small amount of DNA. Human mitochondrial DNA encodes 13
proteins, 22 tRNAs, and 2 rRNAs. Mitochondrial-DNA encoded proteins include NADH-Q
reductase, a cytochrome reductase subunit, cytochrome oxidase subunits, and ATP synthase subunits.

Electron-transfer reactions also occur outside the mitochondria in locations such as the
endoplasmic reticulum, which plays a crucial role in lipid and protein biosynthesis. Cytochrome b5
is a central electron donor for various reductive reactions occurring on the cytoplasmic surface of
liver endoplasmic reticulum. Cytochrome b5 has been found in Golgi, plasma, endoplasmic
reticulum (ER), and microbody membranes.

For a review of mitochondrial metabolism and regulation, see Lodish, H. et al. (1995)
Molecular Cell Biology, Scientific American Books, New York NY, pp. 745-797 and Stryer (1995)
Biochemistry, W.H. Freeman and Co., San Francisco CA, pp. 529-558, 988-989.

The majority of mitochondrial proteins are encoded by nuclear genes, are synthesized on
cytosolic ribosomes, and are inserted into the mitochondria. Nuclear-encoded proteins which are
destined for the mitochondrial matrix typically contain positively-charged amino terminal signal
sequences. Import of these preproteins from the cytoplasm requires a multisubunit protein complex
in the outer membrane known as the translocase of outer mitochondrial membrane (TOM; previously
designated MOM; Pfanner, N. et al. (1996) Trends Biochem. Sci. 21:51-52) and at least three inner
membrane proteins which comprise the translocase of inner mitochondrial membrane (TIM;
previously designated MIM; Pfanner, supra). An inside-negative membrane potential across the
inner mitochondrial membrane is also required for preprotein import. Preproteins are recognized by
surface receptor components of the TOM complex and are translocated through a proteinaceous pore
formed by other TOM components. Proteins targeted to the matrix are then recognized by the import
machinery of the TIM complex. The import systems of the outer and inner membranes can function
independently (Segui-Real, B. et al. (1993) EMBO J. 12:2211-2218).

Once precursor proteins are in the mitochondria, the leader peptide is cleaved by a signal
peptidase to generate the mature protein. Most leader peptides are removed in a one-step process by a
protease termed mitochondrial processing peptidase (MPP) (Paces, V. et al. (1993) Proc. Natl. Acad.
Sci. USA 90:5355-5358). In some cases a two-step process occurs in which MPP generates an
intermediate precursor form which is cleaved by a second enzyme, mitochondrial intermediate
peptidase, to generate the mature protein.

Mitochondrial dysfunction leads to impaired calcium buffering, generation of free radicals
that may participate in deleterious intracellular and extracellular processes, changes in mitochondrial
permeability, and oxidative damage, which is observed in several neurodegenerative diseases.
Neurodegenerative diseases linked to mitochondrial dysfunction include some forms of Alzheimer's
disease, Friedreich's ataxia, familial amyotrophic lateral sclerosis, and Huntington's disease (Beal,
Biophys. Acta 1366:151-165).

α helices or β sheets that bind to the major groove of DNA. Four well-characterized structural motifs
are helix-turn-helix, zinc finger, leucine zipper, and helix-loop-helix. Proteins containing these
motifs may act alone as monomers, or they may form homo- or heterodimers that interact with DNA.

The helix-turn-helix motif consists of two α helices connected at a fixed angle by a short
chain of amino acids. One of the helices binds to the major groove. Helix-turn-helix motifs are
exemplified by the homeobox motif which is present in homeodomain proteins. These proteins are
critical for specifying the anterior-posterior body axis during development and are conserved
throughout the animal kingdom. The Antennapedia and Ultrabithorax proteins of Drosophila
melanogaster are prototypical homeodomain proteins (Pabo, C.O. and R.T. Sauer (1992) Annu. Rev.
Biochem. 61:1053-1095).

The zinc finger motif, which binds zinc ions, generally contains tandem repeats of about 30
amino acids consisting of periodically spaced cysteine and histidine residues. Examples of this
sequence pattern, designated C2H2 and C3HC4 ("RING" finger), have been described (Lewin,
supra). Zinc finger proteins each contain an α helix and an antiparallel β sheet whose proximity and
conformation are maintained by the zinc ion. Contact with DNA is made by the arginine preceding
the α helix and by the second, third, and sixth residues of the α helix. Variants of the zinc finger
motif include poorly defined cysteine-rich motifs which bind zinc or other metal ions. These motifs
may not contain histidine residues and are generally nonrepetitive.
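
The periodic spacing of cysteine and histidine residues described above can be expressed as a sequence pattern. The following is a minimal sketch, assuming one-letter amino acid codes, that searches for candidate C2H2-type motifs with a regular expression. The specific spacing used (two to four residues between the cysteines, a hydrophobic position, and three to five residues between the histidines) is an assumed, simplified signature chosen for illustration, and the function name and test sequence are hypothetical.

```python
import re

# Assumed, simplified C2H2 spacing signature:
# Cys, 2-4 residues, Cys, 3 residues, one hydrophobic residue,
# 8 residues, His, 3-5 residues, His.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")


def find_c2h2_candidates(seq):
    """Return (start, matched_text) for each non-overlapping candidate motif."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical sequence built to contain exactly one candidate motif.
    example = "MK" + "CAACAAAFAAAAAAAAHAAAH" + "GG"
    print(find_c2h2_candidates(example))
```

A match indicates only that the cysteine and histidine spacing is compatible with the assumed signature; it does not establish that the region binds zinc or DNA.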

The leucine zipper motif comprises a stretch of amino acids rich in leucine which can form
an amphipathic α helix. This structure provides the basis for dimerization of two leucine zipper
proteins. The region adjacent to the leucine zipper is usually basic, and upon protein dimerization, is
optimally positioned for binding to the major groove. Proteins containing such motifs are generally
referred to as bZIP transcription factors.

The helix-loop-helix motif (HLH) consists of a short α helix connected by a loop to a longer
α helix. The loop is flexible and allows the two helices to fold back against each other and to bind to
DNA. The transcription factor Myc contains a prototypical HLH motif.

Most transcription factors contain characteristic DNA binding motifs, and variations on the 
above motifs and new motifs have been and are currently being characterized (Faisst, S. and S. Meyer 
(1992) Nucleic Acids Res. 20:3-26). 

Many neoplastic disorders in humans can be attributed to inappropriate gene expression.
Malignant cell growth may result from either excessive expression of tumor promoting genes or
insufficient expression of tumor suppressor genes (Cleary, M.L. (1992) Cancer Surv. 15:89-104).
Chromosomal translocations may also produce chimeric loci which fuse the coding sequence of one
gene with the regulatory regions of a second unrelated gene. Such an arrangement likely results in
inappropriate gene transcription, potentially contributing to malignancy.

Many membrane proteins contain amino acid sequence motifs that target these proteins to
specific subcellular sites. Examples of these motifs include PDZ domains, KDEL, RGD, NGR, and
GSL sequence motifs, von Willebrand factor A (vWFA) domains, and EGF-like domains. RGD,
NGR, and GSL motif-containing peptides have been used as drug delivery agents in targeted cancer
treatment of tumor vasculature (Arap, W. et al. (1998) Science 279:377-380). Furthermore,
membrane proteins may also contain amino acid sequence motifs, such as the carbohydrate
recognition domain (CRD), that mediate interactions with extracellular or intracellular molecules.

Tetraspan Family Proteins

The transmembrane 4 superfamily (TM4SF) or tetraspan family is a multigene family
encoding type III integral membrane proteins (Wright, M.D. and M.G. Tomlinson (1994) Immunol.
Today 15:588-594). The TM4SF is composed of membrane proteins which traverse the cell
membrane four times. Members of the TM4SF include platelet and endothelial cell membrane
proteins, melanoma-associated antigens, leukocyte surface glycoproteins, colonal carcinoma
antigens, tumor-associated antigens, and surface proteins of the schistosome parasites (Jankowski,
S.A. (1994) Oncogene 9:1205-1211). Members of the TM4SF share about 25-30% amino acid
sequence identity with one another.

A number of TM4SF members have been implicated in signal transduction, control of cell 
adhesion, regulation of cell growth and proliferation, including development and oncogenesis, and 
cell motility, including tumor cell metastasis. Expression of TM4SF proteins is associated with a
variety of tumors and the level of expression may be altered when cells are growing or activated.

Tumor Antigens 

Tumor antigens are cell surface molecules that are differentially expressed in tumor cells 
relative to normal cells. Tumor antigens distinguish tumor cells immunologically from normal cells 
and provide diagnostic and therapeutic targets for human cancers (Takagi, S. et al. (1995) Int. J.
Cancer 61:706-715; Liu, E. et al. (1992) Oncogene 7:1027-1032).

Leukocyte Antigens

Other types of cell surface antigens include those identified on leukocytic cells of the
immune system. These antigens have been identified using systematic, monoclonal antibody (mAb)-
based "shot gun" techniques. These techniques have resulted in the production of hundreds of mAbs
directed against unknown cell surface leukocytic antigens. These antigens have been grouped into
"clusters of differentiation" based on common immunocytochemical localization patterns in various
differentiated and undifferentiated leukocytic cell types. Antigens in a given cluster are presumed to
identify a single cell surface protein and are assigned a "cluster of differentiation" or "CD"
designation. Some of the genes encoding proteins identified by CD antigens have been cloned and
verified by standard molecular biology techniques. CD antigens have been characterized as both
transmembrane proteins and cell surface proteins anchored to the plasma membrane via covalent

Hematol. 3:19-26).

Peripheral and Anchored Membrane Proteins 

Some membrane proteins are not membrane-spanning but are attached to the plasma
membrane via membrane anchors or interactions with integral membrane proteins. Membrane
anchors are covalently joined to a protein post-translationally and include such moieties as prenyl,
myristyl, and glycosylphosphatidyl inositol groups. Membrane localization of peripheral and
anchored proteins is important for their function in processes such as receptor-mediated signal
transduction. For example, prenylation of Ras is required for its localization to the plasma membrane
and for its normal and oncogenic functions in signal transduction.

Vesicle Coat Proteins

Intercellular communication is essential for the development and survival of multicellular
organisms. Cells communicate with one another through the secretion and uptake of protein
signaling molecules. The uptake of proteins into the cell is achieved by the endocytic pathway, in
which the interaction of extracellular signaling molecules with plasma membrane receptors results in
the formation of plasma membrane-derived vesicles that enclose and transport the molecules into the
cytosol. These transport vesicles fuse with and mature into endosomal and lysosomal (digestive)
compartments. The secretion of proteins from the cell is achieved by exocytosis, in which molecules
inside of the cell proceed through the secretory pathway. In this pathway, molecules transit from the
ER to the Golgi apparatus and finally to the plasma membrane, where they are secreted from the cell.

Several steps in the transit of material along the secretory and endocytic pathways require the
formation of transport vesicles. Specifically, vesicles form at the transitional endoplasmic reticulum
(tER), the rim of Golgi cisternae, the face of the Trans-Golgi Network (TGN), the plasma membrane
(PM), and tubular extensions of the endosomes. Vesicle formation occurs when a region of
membrane buds off from the donor organelle. The membrane-bound vesicle contains proteins to be
transported and is surrounded by a proteinaceous coat, the components of which are recruited from
the cytosol. Two different classes of coat protein have been identified. Clathrin coats form on
vesicles derived from the TGN and PM, whereas coatomer (COP) coats form on vesicles derived
from the ER and Golgi. COP coats can be further classified as COPI, involved in retrograde traffic
through the Golgi and from the Golgi to the ER, and COPII, involved in anterograde traffic from the
ER to the Golgi (Mellman, supra).

In clathrin-based vesicle formation, adapter proteins bring vesicle cargo and coat proteins
together at the surface of the budding membrane. Adapter protein-1 and -2 select cargo from the
TGN and plasma membrane, respectively, based on molecular information encoded on the
cytoplasmic tail of integral membrane cargo proteins. Adapter proteins also recruit clathrin to the
bud site. Clathrin is a protein complex consisting of three large and three small polypeptide chains
arranged in a three-legged structure called a triskelion. Multiple triskelions and other coat proteins

insufficient expression of tumor suppressor genes (Cleary, M.L. (1992) Cancer Surv. 15:89-104).
Chromosomal translocations may also produce chimeric loci which fuse the coding sequence of one
gene with the regulatory regions of a second unrelated gene. Such an arrangement likely results in
inappropriate gene transcription, potentially contributing to malignancy.

In addition, the immune system responds to infection or trauma by activating a cascade of
events that coordinate the progressive selection, amplification, and mobilization of cellular defense
mechanisms. A complex and balanced program of gene activation and repression is involved in this
process. However, hyperactivity of the immune system as a result of improper or insufficient
regulation of gene expression may result in considerable tissue or organ damage. This damage is
well documented in immunological responses associated with arthritis, allergens, heart attack, stroke,
and infections (Isselbacher, K.J. et al. (1996) Harrison's Principles of Internal Medicine, 13/e,
McGraw Hill, Inc. and Teton Data Systems Software).

Nucleolus 

The nucleolus is a highly organized subcompartment in the nucleus that contains high
concentrations of RNA and proteins and functions mainly in ribosomal RNA synthesis and assembly
(Alberts, et al. supra, pp. 379-382). Ribosomal RNA (rRNA) is a structural RNA that is complexed
with proteins to form ribonucleoprotein structures called ribosomes. Ribosomes provide the
platform on which protein synthesis takes place.

Ribosomes are assembled in the nucleolus initially from a large, 45S rRNA combined with a variety
of proteins imported from the cytoplasm, as well as smaller, 5S rRNAs. Later processing of the
immature ribosome results in formation of smaller ribosomal subunits which are transported from the
nucleolus to the cytoplasm where they are assembled into functional ribosomes.

Endoplasmic Reticulum

In eukaryotes, proteins are synthesized within the endoplasmic reticulum (ER), delivered
from the ER to the Golgi apparatus for post-translational processing and sorting, and transported from
the Golgi to specific intracellular and extracellular destinations. Synthesis of integral membrane
proteins, secreted proteins, and proteins destined for the lumen of a particular organelle occurs on the
rough endoplasmic reticulum (ER). The rough ER is so named because of the rough appearance in
electron micrographs imparted by the attached ribosomes on which protein synthesis proceeds.
Synthesis of proteins destined for the ER actually begins in the cytosol with the synthesis of a
specific signal peptide which directs the growing polypeptide and its attached ribosome to the ER
membrane where the signal peptide is removed and protein synthesis is completed. Soluble proteins
destined for the ER lumen, for secretion, or for transport to the lumen of other organelles pass
completely into the ER lumen. Transmembrane proteins destined for the ER or for other cell
membranes are translocated across the ER membrane but remain anchored in the lipid bilayer of the
membrane by one or more membrane-spanning α-helical regions.

proteins and secretion of hormones, neurotransmitters, digestive enzymes, wastes, etc.

A common property of all of these vacuoles is an acidic pH environment ranging from
approximately pH 4.5-5.0. This acidity is maintained by the presence of a proton ATPase that uses
the energy of ATP hydrolysis to generate an electrochemical proton gradient across a membrane
(Mellman, I. et al. (1986) Annu. Rev. Biochem. 55:663-700). Eukaryotic vacuolar proton ATPase
(vp-ATPase) is a multimeric enzyme composed of 3-10 different subunits. One of these subunits is a
highly hydrophobic polypeptide of approximately 16 kDa that is similar to the proteolipid component
of vp-ATPases from eubacteria, fungi, and plant vacuoles (Mandel, M. et al. (1988) Proc. Natl. Acad.
Sci. USA 85:5521-5524). The 16 kDa proteolipid component is the major subunit of the membrane
portion of vp-ATPase and functions in the transport of protons across the membrane.

Lysosomes

Lysosomes are membranous vesicles containing various hydrolytic enzymes used for the
controlled intracellular digestion of macromolecules. Lysosomes contain some 40 types of enzymes
including proteases, nucleases, glycosidases, lipases, phospholipases, phosphatases, and sulfatases, all
of which are acid hydrolases that function at a pH of about 5. Lysosomes are surrounded by a unique
membrane containing transport proteins that allow the final products of macromolecule degradation,
such as sugars, amino acids, and nucleotides, to be transported to the cytosol where they may be
either excreted or reutilized by the cell. A vp-ATPase, such as that described above, maintains the
acidic environment necessary for hydrolytic activity (Alberts, supra, pp. 610-611).

Endosomes

Endosomes are another type of acidic vacuole that is used to transport substances from the
cell surface to the interior of the cell in the process of endocytosis. Like lysosomes, endosomes have
an acidic environment provided by a vp-ATPase (Alberts et al. supra, pp. 610-618). Two types of
endosomes are apparent based on tracer uptake studies that distinguish their time of formation in the
cell and their cellular location. Early endosomes are found near the plasma membrane and appear to
function primarily in the recycling of internalized receptors back to the cell surface. Late endosomes
appear later in the endocytic process close to the Golgi apparatus and the nucleus, and appear to be
associated with delivery of endocytosed material to lysosomes or to the TGN where they may be
recycled. Specific proteins are associated with particular transport vesicles and their target
compartments that may provide selectivity in targeting vesicles to their proper compartments. A
cytosolic prenylated GTP-binding protein, Rab, is one such protein. Rabs 4, 5, and 11 are associated
with the early endosome, whereas Rabs 7 and 9 associate with the late endosome.

Mitochondria 

Mitochondria are oval-shaped organelles comprising an outer membrane, a tightly folded
inner membrane, an intermembrane space between the outer and inner membranes, and a matrix
inside the inner membrane. The outer membrane contains many porin molecules that allow ions and

when the genes are part of the same signaling pathway. In other cases, such as when the genes
participate in separate signaling pathways, the interactions may be totally unexpected. Therefore,
DNA-based arrays can be used to investigate how genetic predisposition, disease, or therapeutic
treatment affects the expression of a large number of genes.

The discovery of new human molecules satisfies a need in the art by providing new
compositions which are useful in the diagnosis, study, prevention, and treatment of diseases
associated with, as well as effects of exogenous compounds on, the expression of human molecules.



SUMMARY OF THE INVENTION 

The present invention relates to nucleic acid sequences comprising human diagnostic and
therapeutic polynucleotides (dithp) as presented in the Sequence Listing. The dithp uniquely identify
genes encoding human structural, functional, and regulatory molecules.

The invention provides an isolated polynucleotide selected from the group consisting of a) a
polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID
NO:1-2722; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:1-2722; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide
complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d). In one
alternative, the polynucleotide comprises a polynucleotide sequence selected from the group
consisting of SEQ ID NO:1-2722. In another alternative, the polynucleotide comprises at least 30
contiguous nucleotides of a polynucleotide selected from the group consisting of a) a polynucleotide
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-2722; b) a
polynucleotide comprising a naturally occurring polynucleotide comprising a polynucleotide
sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of
SEQ ID NO:1-2722; c) a polynucleotide complementary to the polynucleotide of a); d) a
polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through
d). In another alternative, the polynucleotide comprises at least 60 contiguous nucleotides of a
polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide
sequence selected from the group consisting of SEQ ID NO:1-2722; b) a polynucleotide comprising a
naturally occurring polynucleotide comprising a polynucleotide sequence at least 90% identical to a
polynucleotide sequence selected from the group consisting of SEQ ID NO:1-2722; c) a
polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the
polynucleotide of b); and e) an RNA equivalent of a) through d). The invention further provides a
composition for the detection of expression of human diagnostic and therapeutic polynucleotides
comprising at least one isolated polynucleotide comprising a polynucleotide selected from the group
consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group

90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:1-2722; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide
complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d). In one
alternative, the invention provides a cell transformed with the recombinant polynucleotide. In
another alternative, the invention provides a transgenic organism comprising the recombinant
polynucleotide.

The invention also provides a method for producing a human diagnostic and therapeutic
polypeptide, the method comprising a) culturing a cell under conditions suitable for expression of the
human diagnostic and therapeutic polypeptide, wherein said cell is transformed with a recombinant
polynucleotide, said recombinant polynucleotide comprising an isolated polynucleotide selected from
the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the
group consisting of SEQ ID NO:1-2722; ii) a polynucleotide comprising a naturally occurring
polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group
consisting of SEQ ID NO:1-2722; iii) a polynucleotide complementary to the polynucleotide of i); iv)
a polynucleotide complementary to the polynucleotide of ii); and v) an RNA equivalent of i) through
iv), and b) recovering the human diagnostic and therapeutic polypeptide so expressed. The invention
additionally provides a method wherein the polypeptide has an amino acid sequence selected from the
group consisting of SEQ ID NO:2723-5444.

The invention also provides an isolated human diagnostic and therapeutic polypeptide
(DITHP) encoded by at least one polynucleotide comprising a polynucleotide sequence selected from
the group consisting of SEQ ID NO:1-2722. The invention further provides a method of screening for
a test compound that specifically binds to the polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO:2723-5444. The method comprises a) combining the
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID
NO:2723-5444 with at least one test compound under suitable conditions, and b) detecting binding of the
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID
NO:2723-5444 to the test compound, thereby identifying a compound that specifically binds to the polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:2723-5444.

The invention further provides a microarray wherein at least one element of the microarray is
an isolated polynucleotide comprising at least 30 contiguous nucleotides of a polynucleotide selected
from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from
the group consisting of SEQ ID NO:1-2722; b) a polynucleotide comprising a naturally occurring
polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group
consisting of SEQ ID NO:1-2722; c) a polynucleotide complementary to the polynucleotide of a); d)
a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through
d). The invention also provides a method for generating a transcript image of a sample which

hybridization complex in the treated biological sample is indicative of toxicity of the test compound.

The invention further provides an isolated polypeptide selected from the group consisting of
a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID
NO:2723-5444, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90%
identical to an amino acid sequence selected from the group consisting of SEQ ID NO:2723-5444, c)
a biologically active fragment of a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:2723-5444, and d) an immunogenic fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:2723-5444. In one
alternative, the invention provides an isolated polypeptide comprising an amino acid sequence
selected from the group consisting of SEQ ID NO:2723-5444.
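
The "at least 90% identical" criterion recited above can be illustrated with a short calculation. This is a hedged sketch only: it assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and counts identical residues over all aligned columns, which is one common convention. The alignment method and the counting convention actually intended are not given in this passage, and the function names and sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Assumes pre-aligned input of equal length with '-' for gaps; columns that
    are gaps in both sequences are ignored. This convention is an assumption.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True when percent identity is at or above the threshold (default 90%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two hypothetical ten-residue sequences differing at one position give 90.0 and meet the threshold, while two differences give 80.0 and do not.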

The invention further provides an isolated polynucleotide encoding a polypeptide selected
from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the
group consisting of SEQ ID NO:2723-5444, b) a polypeptide comprising a naturally occurring amino
acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of
SEQ ID NO:2723-5444, c) a biologically active fragment of a polypeptide having an amino acid
sequence selected from the group consisting of SEQ ID NO:2723-5444, and d) an immunogenic
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:2723-5444. In one alternative, the polynucleotide encodes a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:2723-5444. In another alternative,
the polynucleotide comprises a polynucleotide sequence selected from the group consisting of SEQ
ID NO:1-2722.

Additionally, the invention provides an isolated antibody which specifically binds to a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:2723-5444, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:2723-5444, c) a biologically active fragment of a
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID
NO:2723-5444, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from
the group consisting of SEQ ID NO:2723-5444.

The invention further provides a composition comprising a polypeptide selected from the
group consisting of a) a polypeptide comprising an amino acid sequence selected from the group
consisting of SEQ ID NO:2723-5444, b) a polypeptide comprising a naturally occurring amino acid
sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ
ID NO:2723-5444, c) a biologically active fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:2723-5444, and d) an immunogenic fragment of a
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:2723-

fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:2723-5444, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:2723-5444. The method comprises a) combining
the polypeptide with at least one test compound under conditions permissive for the activity of the
polypeptide, b) assessing the activity of the polypeptide in the presence of the test compound, and c)
comparing the activity of the polypeptide in the presence of the test compound with the activity of the
polypeptide in the absence of the test compound, wherein a change in the activity of the polypeptide
in the presence of the test compound is indicative of a compound that modulates the activity of the
polypeptide.
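
The comparison in step c) can be sketched as a simple calculation on two activity measurements. The following is illustrative only: the 20% relative-change cutoff, the function name, and the output labels are assumptions introduced here, since the text requires only "a change" in activity and does not state how large a change must be.

```python
def classify_modulation(activity_with: float, activity_without: float,
                        min_relative_change: float = 0.2) -> str:
    """Compare activity with and without a test compound.

    min_relative_change is a hypothetical noise cutoff (20% by default);
    the specification does not define a numerical threshold.
    """
    if activity_without <= 0:
        raise ValueError("baseline activity must be positive")
    relative_change = (activity_with - activity_without) / activity_without
    if relative_change >= min_relative_change:
        return "activator"
    if relative_change <= -min_relative_change:
        return "inhibitor"
    return "no detectable modulation"
```

In practice the cutoff would be chosen from replicate measurements of assay variability rather than fixed in advance.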



DESCRIPTION OF THE COMPACT DISC-RECORDABLE (CD-R) 
CD-R 1 is identified by: PN-0100 PCT, Copy 1 - SEQUENCE LISTING PART, recorded on
09/11/03 and contains the Sequence Listing formatted in plain ASCII text. The file for the Sequence
Listing is entitled PN-0100.seq.listing.txt, was recorded on 09/11/03 and is 22,079 KB in size. CD-R
2 is an exact copy of CD-R-1. CD-R 2 is identified by PN-0100 PCT, Copy 2 - SEQUENCE
LISTING PART, recorded on 09/11/03. CD-R 3 is an exact copy of CD-R-1. CD-R 3 is identified by
PN-0100 PCT, Copy 3 - SEQUENCE LISTING PART, recorded on 09/11/03. CD-R 4 is an exact
copy of CD-R-1. CD-R 4 is identified by PN-0100 PCT, CRF, recorded on 09/11/03.

The contents of the Sequence Listing named above, which is being submitted on four (4)
compact discs, is incorporated by reference herein in its entirety.

CD-R 5 is identified by: PN-0100 PCT, Copy 1 - TABLES PART, recorded on 09/12/03 and
contains: Tables 1, 2, 3, 4, and 5 formatted in plain ASCII text (tab delimited). The file for Table 1
is entitled pn0100t1.txt, was recorded on 09/12/03 and is 112 KB in size. The file for Table 2 is
entitled pn0100t2.txt, was recorded on 09/12/03 and is 183 KB in size. The file for Table 3 is entitled
pn0100t3.txt, was recorded on 09/12/03 and is 732 KB in size. The file for Table 4 is entitled
pn0100t4.txt, was recorded on 09/12/03 and is 565 KB in size. The file for Table 5 is entitled
pn0100t5.txt, was recorded on 09/12/03 and is 173 KB in size.
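
Because the table files are described as plain ASCII, tab-delimited text, they can be read with a standard delimited-text parser. The sketch below is a minimal illustration using Python's csv module; the function name and the two sample rows are hypothetical, and no column layout is assumed beyond tab separation.

```python
import csv
import io


def read_tab_delimited(handle) -> list[list[str]]:
    """Parse tab-delimited text from an open file handle, skipping blank lines."""
    return [row for row in csv.reader(handle, delimiter="\t") if row]


# Hypothetical two-column sample standing in for one of the table files.
sample = io.StringIO("1\tEXAMPLE_ID_A\n2\tEXAMPLE_ID_B\n")
rows = read_tab_delimited(sample)
```

For a file on disk, the same function could be called on a handle opened with open(path, newline="", encoding="ascii").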

CD-R 6 is an exact copy of CD-R 5. CD-R 6 is identified by: PN-0100 PCT, Copy 2 -
TABLES PART, recorded on 09/12/03. CD-R 7 is an exact copy of CD-R 5. CD-R 7 is identified
by: PN-0100 PCT, Copy 3 - TABLES PART, recorded on 09/12/03.

The contents of each of the tables named above, which are being submitted on three (3) 
compact discs, the contents as described below, are incorporated by reference herein in their entirety. 

DESCRIPTION OF THE TABLES 
Table 1 shows the sequence identification numbers (SEQ ID NO:s) and Incyte identification
numbers (Incyte ID No.) corresponding to the polynucleotides of the present invention, along with the

describing and disclosing the cell lines, vectors, and methodologies which are presented and which
might be used in connection with the invention. Nothing in the specification is to be construed as an
admission that the invention is not entitled to antedate such disclosure by virtue of prior invention.

Definitions 

As used herein, the lower case "dithp" refers to a nucleic acid sequence, while the upper case
"DITHP" refers to an amino acid sequence encoded by dithp. A "full-length" dithp refers to a nucleic
acid sequence containing the entire coding region of a gene endogenously expressed in human tissue.

"Adjuvants" are materials such as Freund's adjuvant, mineral gels (aluminum hydroxide), and
surface active substances (lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole
limpet hemocyanin, and dinitrophenol) which may be administered to increase a host's
immunological response.

"Allele" refers to an alternative form of a nucleic acid sequence. Alleles result from a
"mutation," a change or an alternative reading of the genetic code. Any given gene may have none,
one, or many allelic forms. Mutations which give rise to alleles include deletions, additions, or
substitutions of nucleotides. Each of these changes may occur alone, or in combination with the
others, one or more times in a given nucleic acid sequence. The present invention encompasses
allelic dithp.

An "allelic variant" is an alternative form of the gene encoding DITHP. Allelic variants may 
result from at least one mutation in the nucleic acid sequence and may result in altered mRNAs or in 
polypeptides whose structure or function may or may not be altered. A gene may have none, one, or
many allelic variants of its naturally occurring form. Common mutational changes which give rise to 
allelic variants are generally ascribed to natural deletions, additions, or substitutions of nucleotides. 
Each of these types of changes may occur alone, or in combination with the others, one or more times 
in a given sequence. 

"Altered" nucleic acid sequences encoding DITHP include those sequences with deletions,
insertions, or substitutions of different nucleotides, resulting in a polypeptide the same as DITHP or a
polypeptide with at least one functional characteristic of DITHP. Included within this definition are
polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe
of the polynucleotide encoding DITHP, and improper or unexpected hybridization to allelic variants,
with a locus other than the normal chromosomal locus for the polynucleotide sequence encoding
DITHP. The encoded protein may also be "altered," and may contain deletions, insertions, or
substitutions of amino acid residues which produce a silent change and result in a functionally
equivalent DITHP. Deliberate amino acid substitutions may be made on the basis of similarity in
polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the
residues, as long as the biological or immunological activity of DITHP is retained. For example,
negatively charged amino acids may include aspartic acid and glutamic acid, and positively charged

substrates containing right-handed nucleotides.

"Antisense sequence" refers to a sequence capable of specifically hybridizing to a target
sequence. The antisense sequence may include DNA, RNA, or any nucleic acid mimic or analog
such as peptide nucleic acid (PNA); oligonucleotides having modified backbone linkages such as
phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides having modified
sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or oligonucleotides having
modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or 7-deaza-2'-deoxyguanosine.

"Antisense technology" refers to any technology which relies on the specific hybridization of 
an antisense sequence to a target sequence. 

A "bin" is a portion of computer memory space used by a computer program for storage of
data, and bounded in such a manner that data stored in a bin may be retrieved by the program.

"Biologically active" refers to an amino acid sequence having a structural, regulatory, or
biochemical function of a naturally occurring amino acid sequence.

"Canonical splice site" refers to the polynucleotide GTAG located on the positive strand of
DNA.

"Clone joining" is a process for combining gene bins based upon the bins' containing
sequence information from the same clone. The sequences may assemble into a primary gene
transcript as well as one or more splice variants.

"Complementary" describes the relationship between two single-stranded nucleic acid
sequences that anneal by base-pairing (5'-A-G-T-3' pairs with its complement 3'-T-C-A-5').
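
The base-pairing relationship in this definition can be written out as a short sketch. It assumes standard Watson-Crick DNA pairing (A with T, G with C) and unambiguous bases only; the function names are hypothetical and are introduced here for illustration.

```python
# Watson-Crick pairing for unambiguous DNA bases (assumed alphabet: A, C, G, T).
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement; the result reads 3'->5' if the input reads 5'->3'."""
    return seq.upper().translate(_DNA_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complementary strand rewritten in the conventional 5'->3' direction."""
    return complement(seq)[::-1]


def are_complementary(strand_a: str, strand_b: str) -> bool:
    """True if two strands, both written 5'->3', anneal along their full length."""
    return reverse_complement(strand_a) == strand_b.upper()
```

With the example from the definition, the complement of 5'-AGT-3' is TCA read 3' to 5', which is ACT when rewritten 5' to 3'.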

A "component sequence" is a nucleic acid sequence selected by a computer program such as
PHRED and used to assemble a consensus or template sequence from one or more component
sequences.

A "consensus sequence" or "template sequence" is a nucleic acid sequence which has been 
assembled from overlapping sequences, using a computer program for fragment assembly such as the 
GELVIEW fragment assembly system (Genetics Computer Group (GCG), Madison WI) or using a
relational database management system (RDMS). 

"Conservative amino acid substitutions" are those substitutions that, when made, least 
interfere with the properties of the original protein, i.e., the structure and especially the function of
the protein is conserved and not significantly changed by such substitutions. The table below shows
amino acids which may be substituted for an original amino acid in a protein and which are regarded
as conservative substitutions. 



Original Residue    Conservative Substitution
Ala                 Gly, Ser
Arg                 His, Lys
Asn                 Asp, Gln, His

"E-value" refers to the statistical probability that a match between two sequences occurred by
chance.

"Exon shuffling" refers to the recombination of different coding regions (exons). Since an
exon may represent a structural or functional domain of the encoded protein, new proteins may be
assembled through the novel reassortment of stable substructures, thus allowing acceleration of the
evolution of new protein functions.

A "fragment" is a unique portion of dithp or DITHP which can be identical in sequence to but
shorter in length than the parent sequence. A fragment may comprise up to the entire length of the
defined sequence, minus one nucleotide/amino acid residue. For example, a fragment may comprise
from 10 to 1000 contiguous amino acid residues or nucleotides. A fragment used as a probe, primer,
antigen, therapeutic molecule, or for other purposes, may be at least 5, 10, 15, 16, 20, 25, 30, 40, 50,
60, 75, 100, 150, 250 or at least 500 contiguous amino acid residues or nucleotides in length.
Fragments may be preferentially selected from certain regions of a molecule. For example, a
polypeptide fragment may comprise a certain length of contiguous amino acids selected from the first
250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a certain defined
sequence. Clearly these lengths are exemplary, and any length that is supported by the specification,
including the Sequence Listing and the figures, may be encompassed by the present embodiments.

A fragment of dithp comprises a region of unique polynucleotide sequence that specifically
identifies dithp, for example, as distinct from any other sequence in the same genome. A fragment of
dithp is useful, for example, in hybridization and amplification technologies and in analogous
methods that distinguish dithp from related polynucleotide sequences. The precise length of a
fragment of dithp and the region of dithp to which the fragment corresponds are routinely
determinable by one of ordinary skill in the art based on the intended purpose for the fragment.

A fragment of DITHP is encoded by a fragment of dithp. A fragment of DITHP comprises a
region of unique amino acid sequence that specifically identifies DITHP. For example, a fragment of
DITHP is useful as an immunogenic peptide for the development of antibodies that specifically
recognize DITHP. The precise length of a fragment of DITHP and the region of DITHP to which the
fragment corresponds are routinely determinable by one of ordinary skill in the art based on the
intended purpose for the fragment.

30 A "full length" nucleotide sequence is one containing at least a start site for translation to a 

protein sequence, followed by an open reading frame and a stop site, and encoding a '*full length" 
polypeptide. 

"Hit" refers to a sequence whose annotation will be used to describe a given template.
Criteria for selecting the top hit are as follows: if the template has one or more exact nucleic acid
matches, the top hit is the exact match with highest percent identity. If the template has no exact
matches but has significant protein hits, the top hit is the protein hit with the lowest E-value. If the
hybridizations. Appropriate hybridization conditions are routinely determinable by one of ordinary
skill in the art.

"Immunologically active" or "immunogenic" describes the potential for a natural,
recombinant, or synthetic peptide, epitope, polypeptide, or protein to induce antibody production in
appropriate animals, cells, or cell lines.

"immune response" can refer to conditions associated with inflammation, trauma, immune 
disorders, or infectious or genetic disease, etc. These conditions can be characterized by expression 
of various factors, e-g., cytokines, chemokines, and other signaling molecules, which may affect 
cellular and systemic defense systems. 
10 An "immunogenic fragment" is a polypeptide or oligopeptide firagment of DITHP which is 

capable of eliciting an immune response when introduced into a living organism, for exanq>le, a 
mammal. The tenn "immunogenic fragment" also includes any polypeptide or oligopeptide fragment 
of DITHP which can be useful in any of the antibody production methods disclosed herein or known 
in the art. 

15 "Insertion" or "addition" refers to a change in either a nucleic or amino acid sequence in 

which at least one nucleotide or residue, respectively, is added to the sequence. 

"Labeling" refers to the covalent or noncovalent joining of a polynucleotide, polypeptide, or 
antibody with a reporter molecule capable of producing a detectable or measurable signal. 

"Microarray" is any arrangement of nucleic acids, amino acids, antibodies, etc., on a 
20 substrate. The substrate may be a solid support such as beads, glass, paper, nitrocellulose, nylon, or 
an appropriate membrane. 

"Linkers" are short stretches of nucleotide sequence which may be added to a vector or a 
dithp to create restriction endonuclease sites to facilitate cloning. "Polylinkers" are engineered to 
incorporate multiple restriction enzyme sites and to provide for the use of enzymes which leave 5' or 
25 3' overhangs (e.g., BanaHI, EcoRI, and Hindlll) and those which provide blunt ends (e.g., EcoRV, 
SnaBI, and StuI). 

"Naturally occurring" refers to an endogenous polynucleotide or polypeptide that may be 
isolated from viruses or prokaryotic or eukaryotic cells. 

"Nucleic acid sequence" refers to the specific order of nucleotides joined by phosphodiester 
30 bonds in a linear, polymeric arrangement. Depending on the nmnber of nucleotides, the nucleic acid 
sequence can be considered an oligomer, oligonucleotide, or polynucleotide. The nucleic acid can be 
DNA, RNA, or any nucleic acid analog, such as PNA, may be of genomic or synthetic origin, may be 
either double-stranded or single-stranded, and can represent either the sense or antisense 
(complementary) strand. 

35 "Oligomer" refers to a nucleic acid sequence of at least about 6 nucleotides and as many as 

about 60 nucleotides, preferably about 15 to 40 nucleotides, and most preferably between about 20 
sequences, one may use BLASTN with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999)
set at default parameters. Such default parameters may be, for example:

Matrix: BLOSUM62
Reward for match: 1
Penalty for mismatch: -2
Open Gap: 5 and Extension Gap: 2 penalties
Gap x drop-off: 50
Expect: 10
Word Size: 11
Filter: on

Percent identity may be measured over the length of an entire defined sequence, for example,
as defined by a particular SEQ ID number, or may be measured over a shorter length, for example,
over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at
least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous
nucleotides. Such lengths are exemplary only, and it is understood that any fragment length
supported by the sequences shown herein, in figures or Sequence Listings, may be used to describe a
length over which percentage identity may be measured.
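For illustration, percent identity over a chosen length can be computed from two sequences that have already been aligned. The sketch below is not the BLAST or CLUSTAL V implementation; it assumes gapped alignment strings of equal length in which "-" marks a gap, and it assumes that identity is reported as matches divided by alignment length. The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumptions: inputs are pre-aligned, equal in length, and use '-' for gaps;
    the denominator is the alignment length (gap columns count as mismatches).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper()) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)
```

To measure identity over a fragment rather than the entire defined sequence, the same function may be applied to the corresponding slice of both alignment strings.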

Nucleic acid sequences that do not show a high degree of identity may nevertheless encode
similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes
in nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid
sequences that all encode substantially the same protein.
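A short worked example may make the effect of degeneracy concrete. The sketch below uses a deliberately partial codon table (standard genetic code assignments for Met, Ala, Leu, and Arg only) and two made-up 12-nucleotide sequences; neither sequence is taken from the Sequence Listing.

```python
# Partial codon table (standard genetic code); only Met, Ala, Leu, and Arg are listed.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "CGT": "R", "CGC": "R", "CGA": "R", "CGG": "R", "AGA": "R", "AGG": "R",
}


def translate(dna: str) -> str:
    """Translate complete codons in frame 0; raises KeyError for codons not in the table."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def nucleotide_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical sequences that differ at 4 of 12 positions yet encode the same peptide.
seq_a = "ATGGCTCTTCGT"
seq_b = "ATGGCACTGAGA"
```

Both sequences translate to the peptide M-A-L-R although they share only 8 of 12 nucleotides.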

The phrases "percent identity" and "% identity", as applied to polypeptide sequences, refer to 
the percentage of residue matches between at least two polypeptide sequences aligned usmg a 
standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some 

25 alignment methods take into account conservative amino acid substitutions. Such conservative 

substitutions, explained in more detail above, generally preserve the hydrophobicity and acidity of the 
substituted residue, thus preserving the structure (and therefore function) of the folded polypeptide. 

Percent identity between polypeptide sequences may be determined using the default
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e
sequence alignment program (described and referenced above). For pairwise alignments of
polypeptide sequences using CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap
penalty=3, window=5, and "diagonals saved"=5. The PAM250 matrix is selected as the default
residue weight table. As with polynucleotide alignments, the percent identity is reported by
CLUSTAL V as the "percent similarity" between aligned polypeptide sequence pairs.

Alternatively the NCBI BLAST software suite may be used. For example, for a pairwise
comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences" tool Version 2.0.9
Biology, 4th ed., John Wiley & Sons, New York NY), and Innis, M. et al. (1990; PCR Protocols, A
Guide to Methods and Applications, Academic Press, San Diego CA). PCR primer pairs can be
derived from a known sequence, for example, by using computer programs intended for that purpose
such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge MA).

Oligonucleotides for use as primers are selected using software known in the art for such
purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to
100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to
5,000 nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer
selection programs have incorporated additional features for expanded capabilities. For example, the
PrimOU primer selection program (available to the public from the Genome Center at University of
Texas South West Medical Center, Dallas TX) is capable of choosing specific primers from
megabase sequences and is thus useful for designing primers on a genome-wide scope. The Primer3
primer selection program (available to the public from the Whitehead Institute/MIT Center for
Genome Research, Cambridge MA) allows the user to input a "mispriming library," in which
sequences to avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the
selection of oligonucleotides for microarrays. (The source code for the latter two primer selection
programs may also be obtained from their respective sources and modified to meet the user's specific
needs.) The PrimeGen program (available to the public from the UK Human Genome Mapping
Project Resource Centre, Cambridge UK) designs primers based on multiple sequence alignments,
thereby allowing selection of primers that hybridize to either the most conserved or least conserved
regions of aligned nucleic acid sequences. Hence, this program is useful for identification of both
unique and conserved oligonucleotides and polynucleotide fragments. The oligonucleotides and
polynucleotide fragments identified by any of the above selection methods are useful in hybridization
technologies, for example, as PCR or sequencing primers, microarray elements, or specific probes to
identify fully or partially complementary polynucleotides in a sample of nucleic acids. Methods of
oligonucleotide selection are not limited to those described above.

"Purified" refers to molecules, either polynucleotides or polypeptides, that are isolated or
separated from their natural environment and are at least 60% free, preferably at least 75% free, and
most preferably at least 90% free from other compounds with which they are naturally associated.

30 A "recombinant nucleic acid" is a sequence that is not naturally occurring or has a sequence 

that is made by an artificial combination of two or more otherwise separated segments of sequence. 
This artificial combination is often accomplished by chemical synthesis or, more commonly, by the 
artificial naanipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques 
such as those described in Sambrook, supra . The term recombinant includes nucleic acids that have 

35 been altered solely by addition, substitution, or deletion of a portion of the nucleic acid. Frequently, a 
recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter 
"Transformation" refers to a process by which exogenous DNA enters a recipient cell.
Transformation may occur under natural or artificial conditions using various methods well known in
the art. Transformation may rely on any known method for the insertion of foreign nucleic acid
sequences into a prokaryotic or eukaryotic host cell. The method is selected based on the host cell
being transformed.

"Transformants" include stably transformed cells in which the inserted DNA is capable of
replication either as an autonomously replicating plasmid or as part of the host chromosome, as well
as cells which transiently express inserted DNA or RNA.

A "transgenic organism," as used herein, is any organism, including but not Innited to animals 

10 and plants, in which one or more of the cells of the organism contains heterologous nucleic acid 
introduced by way of human intervention, such as by transgenic techniques well known in the art. 
The nucleic acid is introduced into the cell, directly or mdirectly by introduction into a precursor of 
the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with a 
recombinant virus. The term genetic manipulation does not include classical cross-breeding, or in 

15 vitro fertilization, but rather is directed to the introduction of a recombinant DNA molecule. The 
transgenic organisms contenq)lated in accordance with the present invention include bacteria, 
cyanobacteria, fungi, and plants and animals. The isolated DNA of the present invention can be 
introduced into the host by methods known in the art, for example infection, transfection, 
transformation or transconjugation. Techniques for transferring the DNA of the present invention 

20 into such organisms are widely known and provided in references such as Sambrook et al. (1989), 
supra . 

A "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having 
at least 25% sequence identity to the particular nucleic acid sequence over a certain length of one of 
the nucleic acid sequences using BLASTN with the "BLAST 2 Sequences" tool Version 2.0.9 (May- 

25 07-1999) set at default parameters. Such a pair of nucleic acids may show, for example, at least 30%, 
at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 
93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater 
sequence identity over a certain defmed length. The variant may result in "conservative" amino acid 
changes which do not affect structural and/or chemical properties. A variant may be described as, for 

30 example, an "allelic" (as defmed above), "splice," "species," or "polymorphic" variant. A splice 
variant may have significant identity to a reference molecule, but will generally have a greater or 
lesser number of polynucleotides due to altemate splicing of exons during mRNA processing. The 
corresponding polypeptide may possess additional functional domains or lack domains that are 
present in the reference molecule. Species variants are polynucleotide sequences that vary from one 

35 species to another. The resulting polypeptides generally will have significant amino acid identity 
relative to each other. A polymorphic variant is a variation in the polynucleotide sequence of a 
indicated by a probability score in column 4, and the functional annotation corresponding to each
GenBank hit is listed in column 5.

An alternative embodiment utilizes cDNA sequences disclosed in publicly available DNA
sequence databases (for example, GenBank, National Center for Biotechnology Information (NCBI),
Bethesda, MD) as well as cDNA sequences derived from human tissues and cell lines which have
been aligned to genomic contigs obtained from NCBI to identify cDNA transcripts. The cDNA
transcript template sequences are identified by the Incyte identification numbers (Incyte IDs) in
column 2 of Table 2. The sequence identification numbers (SEQ ID NO:s) corresponding to the
cDNA transcript template IDs are shown in column 1. The cDNA transcript template sequences have
similarity to GenBank sequences, or "hits," as designated by the GI Numbers in column 3. The
statistical probability of each GenBank hit is indicated by a probability score in column 4, and the
functional annotation corresponding to each GenBank hit is listed in column 5.

The invention incorporates the nucleic acid sequences of these templates as disclosed in the
Sequence Listing and the use of these sequences in the diagnosis and treatment of disease states
characterized by defects in human molecules. The invention further utilizes these sequences in
hybridization and amplification technologies, and in particular, in technologies which assess gene
expression patterns correlated with specific cells or tissues and their responses in vivo or in vitro to
pharmaceutical agents, toxins, and other treatments. In this manner, the sequences of the present
invention are used to develop a transcript image for a particular cell or tissue.

Derivation of Nucleic Acid Sequences

cDNA was isolated from libraries constructed using RNA derived from normal and diseased
human tissues and cell lines. The human tissues and cell lines used for cDNA library construction
were selected from a broad range of sources to provide a diverse population of cDNAs representative
of gene transcription throughout the human body. Descriptions of the human tissues and cell lines
used for cDNA library construction are provided in the LIFESEQ database (Incyte Corporation
(Incyte), Palo Alto, CA). Human tissues were broadly selected from, for example, cardiovascular,
dermatologic, endocrine, gastrointestinal, hematopoietic/immune system, musculoskeletal, neural,
reproductive, and urologic sources.

Cell lines used for cDNA library construction were derived from, for example, leukemic
cells, teratocarcinomas, neuroepitheliomas, cervical carcinoma, lung fibroblasts, and endothelial cells.
Such cell lines include, for example, THP-1, Jurkat, HUVEC, hNT2, WI38, HeLa, and other cell
lines commonly used and available from public depositories (American Type Culture Collection,
Manassas VA). Prior to mRNA isolation, cell lines were untreated, treated with a pharmaceutical
agent such as 5'-aza-2'-deoxycytidine, treated with an activating agent such as lipopolysaccharide in
the case of leukocytic cell lines, or, in the case of endothelial cell lines, subjected to shear stress.

Sequencing of the cDNAs

Methods for DNA sequencing are well known in the art. Conventional enzymatic methods
employ the Klenow fragment of DNA polymerase I, SEQUENASE DNA polymerase (U.S.
Biochemical Corporation, Cleveland OH), Taq polymerase (Applied Biosystems, Foster City CA),
thermostable T7 polymerase (Amersham Pharmacia Biotech, Inc. (Amersham Pharmacia Biotech),
Piscataway NJ), or combinations of polymerases and proofreading exonucleases such as those found
in the ELONGASE amplification system (Life Technologies Inc. (Life Technologies), Gaithersburg
MD), to extend the nucleic acid sequence from an oligonucleotide primer annealed to the DNA
template of interest. Methods have been developed for the use of both single-stranded and
double-stranded templates. Chain termination reaction products may be electrophoresed on
urea-polyacrylamide gels and detected either by autoradiography (for radioisotope-labeled nucleotides) or
by fluorescence (for fluorophore-labeled nucleotides). Automated methods for mechanized reaction
preparation, sequencing, and analysis using fluorescence detection methods have been developed.
Machines used to prepare cDNAs for sequencing can include the MICROLAB 2200 liquid transfer
system (Hamilton Company (Hamilton), Reno NV), Peltier thermal cycler (PTC200; MJ Research,
Inc. (MJ Research), Watertown MA), and ABI CATALYST 800 thermal cycler (Applied
Biosystems). Sequencing can be carried out using, for example, the ABI 373 or 377 (Applied
Biosystems) or MEGABACE 1000 (Molecular Dynamics, Inc. (Molecular Dynamics), Sunnyvale
CA) DNA sequencing systems, or other automated and manual sequencing systems well known in the
art.

When multiple clones are determined to require complete insert sequencing, the clones are
pooled into "shot-gun" libraries. The cDNA inserts from these pools are amplified by PCR,
mechanically sheared into smaller pieces and cloned into plasmid vectors for vector primer
sequencing. Assembly of the nucleic acid sequences of the small pieces into their respective parent
full-length insert can then be accomplished using sequence assembly programs such as PHRAP
(phrap.org/phrap.docs/phrap.html).

Additionally, when DNA sequencing primers derived from cloning vectors generate only
nucleic acid sequence coverage of the cDNA insert ends of a clone, the complete sequence of the
internal regions of cDNA inserts can be achieved by performing DNA sequencing using gene specific
primers and "primer-walking" methods (Shyamala, V. and G.F. Ames (1989) Gene 84:1-8;
incorporated by reference in its entirety). Primer-walking is carried out in iterative cycles until the
primer-walk sequences can be assembled into a non-gapped, contiguous consensus sequence.

The nucleotide sequences of the Sequence Listing have been prepared by current,
state-of-the-art automated methods and, as such, may contain occasional sequencing errors or unidentified
nucleotides. Such unidentified nucleotides are designated by an N. These infrequent unidentified
bases do not represent a hindrance to practicing the invention for those skilled in the art. Several
methods employing standard recombinant techniques may be used to correct errors and complete the
missing sequence information. (See, e.g., those described in Ausubel, F.M. et al. (1997) Short
Protocols in Molecular Biology, John Wiley & Sons, New York NY; and Sambrook, J. et al. (1989)
Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview NY.)

cDNA Alignment to Genomic Contigs

cDNA transcript templates found in public databases (e.g., GenBank) and EST sequences
generated by Incyte Corporation were aligned to the human genome. The human genomic contigs
used were publicly available from the National Center for Biotechnology Information (NCBI). A
proprietary algorithm, IRISS (Incyte Research Informatics Sequence Search), was used as a
preliminary step to define the cDNA sequence/masked genomic DNA contig pairings. IRISS was
designed to match all cDNAs in a large database to one or more loci within the human genome. This
is achieved by first identifying all exact matches of 21 bp in length or greater between the cDNAs
and the genomic sequence, and secondly, combining these exact matches into hits. Comparable
pairings can be achieved using publicly available alignment algorithms such as MEGABLAST
(Zhang, Z. et al. (2000) J. Comput. Biol. 7:203-214). A pairing occurred if 50% of the length of the
cDNA sequence was aligned. The cDNA/genomic pairings identified by IRISS were analyzed using
the SIM4 alignment algorithm (version May 2000 with optimization for high throughput and strand
assignment confidence, Florea et al. (1998) Genome Res. 8:967-974). For cDNAs which hit multiple
genomic locations, only the SIM4 alignment providing the highest percent identity was retained.

The SIM4 results were then analyzed by determining alignment quality, strand assignment
and polyA location. The alignment quality was assessed first by examining the terminal exons of
cDNAs. Terminal exons were cleaved if the exon was less than 9 bp in length or the intron length
exceeded 40 Kb and a certain exon length and percent identity threshold was not met (as determined
by the intron length). Terminal exons were also cleaved if the exons appeared to be derived from
poly A tails. Any cDNA sequence meeting any of the following criteria was determined to be a false
result: i) a gap of more than 5 bp present within the aligned cDNA/genomic pairing; ii) the global
identity or coverage of the alignment was below 95%; or iii) the global coverage length of the
alignment was less than 50 bp.
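The three rejection criteria listed above translate directly into a filter. The sketch below is illustrative; the record type and its field names are assumptions introduced here, and only the thresholds (5 bp, 95%, 50 bp) come from the text.

```python
from dataclasses import dataclass


@dataclass
class AlignmentSummary:
    """Hypothetical summary of one cDNA/genomic alignment."""
    max_gap_bp: int            # largest gap within the aligned pairing
    global_identity: float     # percent identity over the alignment
    global_coverage: float     # percent of the cDNA covered by the alignment
    coverage_length_bp: int    # aligned length in base pairs


def is_false_result(a: AlignmentSummary) -> bool:
    """Apply criteria i) through iii); any one of them marks the alignment as false."""
    return (
        a.max_gap_bp > 5
        or a.global_identity < 95.0
        or a.global_coverage < 95.0
        or a.coverage_length_bp < 50
    )
```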

Strand assignments for cDNAs containing multiple exons were derived from SIM4.
Experimentally determined cDNA sequence read direction was used for single exon cDNAs, if
available. Single exon cDNAs with no lab read direction utilized read directions predicted by
ESTScan (Iseli, C. et al. (1999) Proc. Int. Conf. Intell. Syst. Mol. Biol. AAAI, 1999:138-148) if such
predictions were available. When neither an experimental sequence read nor an ESTScan prediction
was available, strand assignment was either based on overlap with an assigned cDNA, or the cDNA
was assigned by default to the positive strand.
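The order of precedence described in this paragraph can be written as a simple fallback chain. The sketch below is illustrative; the function and parameter names are hypothetical, and strands are represented as the strings "+" and "-".

```python
def assign_strand(
    exon_count,
    sim4_strand=None,
    lab_read_direction=None,
    estscan_direction=None,
    overlapping_cdna_strand=None,
):
    """Return "+" or "-" following the precedence described in the text."""
    if exon_count > 1:
        # Multi-exon cDNAs take their strand from the SIM4 alignment.
        if sim4_strand is None:
            raise ValueError("multi-exon cDNA requires a SIM4 strand")
        return sim4_strand
    # Single-exon cDNAs: lab read direction, then ESTScan, then overlap.
    for candidate in (lab_read_direction, estscan_direction, overlapping_cdna_strand):
        if candidate is not None:
            return candidate
    # Default when no evidence is available.
    return "+"
```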

Determination of the polyA site was performed by examining an area of the cDNA 35 bp
upstream to 120 bp downstream of the 3' end and searching for the presence of the eleven known
excluded. If a putative gene bound contained less than three single exon ESTs (a gene with shallow
EST coverage) and completely overlapped an opposite strand gene which contained greater than one
clone (a gene with deeper coverage), the ESTs from the shallow gene were re-assigned to the opposite
strand and incorporated into the deeper, opposite strand gene.

Generation of cDNA Transcript Templates

cDNA transcripts were generated within gene bounds by a multi-step process that first
modified the start and stop coordinates of cDNAs in order to generate transcripts which were devoid
of hnRNA (heteronuclear RNA) contamination and potentially faulty misalignments due to SIM4.
These modifications included removing cDNA transcripts containing hnRNA or those cDNA
transcripts which were misaligned. hnRNA was identified as start/stop coordinates of one cDNA
transcript appearing within the intron of another cDNA transcript. Correction involved moving the
coverage start or coverage stop locations of the cDNA transcript to the next nearest splice start or
stop, or moving to a second more distant splice start and stop if both cDNA transcripts in question
shared the same splice start/stop next to the coverage start/stop in question.
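The hnRNA test stated above (a start or stop coordinate of one cDNA transcript falling within an intron of another) can be illustrated as follows. This is a sketch only: exons are assumed to be sorted, non-overlapping (start, end) pairs with inclusive genomic coordinates on the same strand, the function names are hypothetical, and the subsequent coordinate correction is not shown.

```python
def introns(exons):
    """Introns implied by a sorted list of inclusive (start, end) exon coordinates."""
    return [(exons[i][1] + 1, exons[i + 1][0] - 1) for i in range(len(exons) - 1)]


def has_hnrna_evidence(candidate_exons, other_exons):
    """True if the candidate's coverage start or stop lies inside an intron of the other cDNA."""
    start, stop = candidate_exons[0][0], candidate_exons[-1][1]
    return any(
        lo <= pos <= hi
        for lo, hi in introns(other_exons)
        for pos in (start, stop)
    )
```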

cDNAs were grouped together for misalignment analysis if they had the same intron size (+/-
30 bp) and the splice site distance was +/- 30 bp between the position of two cDNAs. Introns were
identified as misaligned if the splice site window identity average was greater than 98% (window size
10 bp on each side of splice site) and the intron had a total depth of greater than or equal to 5% (of
cDNAs in the cluster). The depth was calculated from the number of cDNAs used as evidence for
this splicing event over all cDNAs used in the analysis and whether a canonical splice site was
present, or the splice site window identity of any cDNA was 100%, regardless of depth and score. If
none of these criteria were met, the splice sites of the cDNAs were adjusted so that their intron was
the same as the splice site represented by the majority of cDNAs.

After modification, the transcripts are generated from the cDNAs in the following manner.
The cDNAs are represented as nodes in a directed acyclic graph (DAG). Each node is linked in a
directed manner to other nodes if the cDNA of the original node can be extended by the cDNAs
represented by the other nodes. Extension is defined by "extending a cDNA in the 3' direction" if the
extending cDNA has a subset of exons which in turn are a subset of the original cDNA exons. This
process is called "mating". Whenever a cDNA can be mated to another, a directed link is created in
the DAG. After all such links are created, transcript generation can proceed.

A transcript is generated by starting at a node that has no incoming links (current cDNA
cannot be extended 5' by another cDNA), following an outgoing link (current cDNA can be extended
3' by another cDNA) to the next node, and repeating that process until finally a node is reached which
has no outgoing links (current cDNA cannot be extended 3' by another possible cDNA). This is
called "traversing" the DAG. All the distinct exons encountered by such a traversal in going from
node to node (i.e., from cDNA to cDNA) are then assembled into the exons of a transcript. All
distinct traversals of the DAG (distinct sets of exons) lead to the generation of all distinct transcripts
for that gene bound.
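The traversal described above amounts to enumerating every path from a node with no incoming links to a node with no outgoing links and taking the union of exons along each path. The following sketch illustrates that idea; the data layout (a dictionary of exon sets and a dictionary of 3' extension links) and the function name are assumptions, and the "mating" test that creates the links is taken as already done.

```python
def generate_transcripts(exons_by_cdna, links):
    """Enumerate transcripts as sorted exon tuples, one per distinct exon set.

    exons_by_cdna: {cdna_id: set of (start, end) exons}
    links: {cdna_id: [cdna_ids that extend it in the 3' direction]} (must be acyclic)
    """
    has_incoming = {target for targets in links.values() for target in targets}
    start_nodes = [node for node in exons_by_cdna if node not in has_incoming]
    transcripts = set()

    def walk(node, collected):
        collected = collected | exons_by_cdna[node]
        next_nodes = links.get(node, [])
        if not next_nodes:
            # No outgoing links: the traversal ends and a transcript is emitted.
            transcripts.add(tuple(sorted(collected)))
            return
        for nxt in next_nodes:
            walk(nxt, collected)

    for node in start_nodes:
        walk(node, frozenset())
    return sorted(transcripts)
```

With cDNA "a" extended 3' by either "b" or "c", two traversals exist and two transcripts result, each sharing the exons contributed by "a".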

Assembly of cDNA Sequences

Human polynucleotide sequences may be assembled using programs or algorithms well
known in the art. Sequences to be assembled are related, wholly or in part, and may be derived from
a single or many different transcripts. Assembly of the sequences can be performed using such
programs as PHRAP (Phils Revised Assembly Program) and the GELVIEW fragment assembly
system (GCG), or other methods known in the art.

Alternatively, cDNA sequences are used as "component" sequences that are assembled into
"template" or "consensus" sequences as follows. Sequence chromatograms are processed, verified,
and quality scores are obtained using PHRED. Raw sequences are edited using an editing pathway
known as Block 1 (See, e.g., the LIFESEQ Assembled User Guide, Incyte). A series of BLAST
comparisons is performed and low-information segments and repetitive elements (e.g., dinucleotide
repeats, Alu repeats, etc.) are replaced by "n's", or masked, to prevent spurious matches.
Mitochondrial and ribosomal RNA sequences are also removed. The processed sequences are then
loaded into a relational database management system (RDMS) which assigns edited sequences to
existing templates, if available. When additional sequences are added into the RDMS, a process is
initiated which modifies existing templates or creates new templates from works in progress (i.e.,
nonfinal assembled sequences) containing queued sequences or the sequences themselves. After the
new sequences have been assigned to templates, the templates can be merged into bins. If multiple
templates exist in one bin, the bin can be split and the templates reannotated.

Once gene bins have been generated based upon sequence alignments, bins are "clone joined"
based upon clone information. Clone joining occurs when the 5' sequence of one clone is present in
one bin and the 3' sequence from the same clone is present in a different bin, indicating that the two
bins should be merged into a single bin. Only bins which share at least two different clones are
merged.

A resultant transcript template sequence may contain either a partial or a full length open
reading frame, or all or part of a genetic regulatory element. This variation is due in part to the fact
that the full length cDNAs of many genes are several hundred, and sometimes several thousand, bases
in length. With current technology, cDNAs comprising the coding regions of large genes cannot be
cloned because of vector limitations, incomplete reverse transcription of the mRNA, or incomplete
"second strand" synthesis. Template sequences may be extended to include additional contiguous
sequences derived from the parent RNA transcript using a variety of methods known to those of skill
in the art. Extension may thus be used to achieve the full length coding sequence of a gene.
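
The clone-joining rule (merge two bins only when they share at least two different clones) can be sketched with a union-find structure. This is an illustration under stated assumptions, not the database procedure itself: the `bins` mapping, the function name, and the simplification that a clone listed in two bins stands for its 5' read in one and its 3' read in the other are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def clone_join(bins: dict[str, set[str]], min_shared: int = 2) -> list[set[str]]:
    """Group bin ids that should be merged because they share enough clones.

    `bins` maps a bin id to the set of clone ids with a read in that bin.
    """
    parent = {b: b for b in bins}

    def find(b: str) -> str:
        # Follow parent links to the representative, compressing the path.
        while parent[b] != b:
            parent[b] = parent[parent[b]]
            b = parent[b]
        return b

    for a, b in combinations(bins, 2):
        if len(bins[a] & bins[b]) >= min_shared:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for b in bins:
        groups[find(b)].add(b)
    return sorted(groups.values(), key=lambda g: sorted(g))

example_bins = {
    "bin1": {"c1", "c2", "c3"},
    "bin2": {"c1", "c2"},   # shares two clones with bin1, so the two are joined
    "bin3": {"c3"},         # shares only one clone with bin1, so it stays separate
    "bin4": {"c9"},
}
merged = clone_join(example_bins)
```

Requiring two shared clones, as the text specifies, guards against a single chimeric clone joining unrelated bins.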
Analysis of the cDNA Sequences

The cDNA sequences are analyzed using a variety of programs and algorithms which are well
known in the art. (See, e.g., Ausubel, 1997, supra, Chapter 7.7; Meyers, R.A. (Ed.) (1995) Molecular
Biology and Biotechnology, Wiley VCH, New York NY, pp. 856-853; and Table 6.) These analyses
comprise both reading frame determinations, e.g., based on triplet codon periodicity for particular
organisms (Fickett, J.W. (1982) Nucleic Acids Res. 10:5303-5318); analyses of potential start and
stop codons; and homology searches.
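
As an illustration of the start and stop codon analysis mentioned above, the following minimal sketch scans the three forward reading frames for the longest ATG-to-stop open reading frame. It assumes the standard stop codons and ignores the reverse strand and codon periodicity statistics; the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_forward_orf(seq: str) -> str:
    """Return the longest ATG...stop ORF found in the three forward frames."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first in-frame start codon
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]         # include the stop codon
                if len(orf) > len(best):
                    best = orf
                start = None                   # look for the next ORF in this frame
    return best

orf = longest_forward_orf("GGATGAAACCCTAAGG")
```

A fuller analysis would repeat the scan on the reverse complement to cover all six frames.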

Computer programs known to those of skill in the art for performing computer-assisted
searches for amino acid and nucleic acid sequence similarity include, for example, Basic Local
Alignment Search Tool (BLAST; Altschul, S.F. (1993) J. Mol. Evol. 36:290-300; Altschul, S.F. et al.
(1990) J. Mol. Biol. 215:403-410). Gapped BLAST (Altschul, S.F., Madden, T.L., Schäffer, A.A.,
Zhang, J., Zhang, Z., Miller, W. & Lipman, D.J. (1997) "Gapped BLAST and PSI-BLAST: a new
generation of protein database search programs." Nucleic Acids Res. 25:3389-3402) is especially
useful in determining exact and gapped matches by comparing two sequence fragments of
arbitrary but equal lengths, whose alignment is locally maximal and for which the alignment score
meets or exceeds a threshold or cutoff score set by the user (Karlin, S. et al. (1988) Proc. Natl. Acad.
Sci. USA 85:841-845). Using an appropriate search tool (e.g., BLAST or HMM), GenBank,
SwissProt, PFAM, Protein Data Bank (PDB) and other databases may be searched for sequences
containing regions of homology to a query dithp or DITHP of the present invention.
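
The idea of a locally maximal alignment whose score must meet a user-set cutoff can be shown with a small Smith-Waterman scorer. This is not BLAST, which uses seeding heuristics and score statistics; it is a plain dynamic-programming sketch, and the scoring parameters and the cutoff value are assumptions chosen for illustration.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Best Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best

CUTOFF = 6  # hypothetical user-set threshold
score = local_alignment_score("TTACGTTT", "GGACGTGG")
reported = score >= CUTOFF
```

Only pairs whose best local score reaches the cutoff would be reported as hits, which mirrors the threshold behavior described in the paragraph.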

Other approaches to the identification, assembly, storage, and display of nucleotide and
polypeptide sequences are provided in "Relational Database for Storing Biomolecule Information,"
U.S.S.N. 08/947,845, filed October 9, 1997; "Project-Based Full-Length Biomolecular Sequence
Database," U.S. Patent No. 5,953,727; and "Relational Database and System for Storing Information
Relating to Biomolecular Sequences," U.S. Patent No. 6,553,317, all of which are incorporated by
reference herein in their entirety.

Protein hierarchies can be assigned to the putative encoded polypeptide based on, e.g., motif,
BLAST, or biological analysis. Methods for assigning these hierarchies are described, for example,
in "Database System Employing Protein Function Hierarchies for Viewing Biomolecular Sequence
Data," U.S. Patent No. 6,023,659, incorporated herein by reference. Gene Ontology assignments can
also be made based on these tools (godatabase.org, Ashburner, M., et al. (2000) Nature Genet.
25:25-29).

Protein Translation Prediction From cDNA Transcripts 

The cDNA transcript template sequences were further analyzed by translating each template
using BLASTX against either the SwissProt or GenPept (version 130) databases, saving those hits
with an E-value less than or equal to 1e-45. Transcripts having a predicted protein were evaluated by
both a global alignment-based translation method and a maximal size ORF (Open Reading Frame)-based
translation method. In the global alignment method, the transcript was realigned to its top

Sequences of Human Diagnostic and Therapeutic Molecules

The dithp of the present invention may be used for a variety of diagnostic and therapeutic
purposes. For example, a dithp may be used to diagnose a particular condition, disease, or disorder
associated with human molecules. Such conditions, diseases, and disorders include, but are not
limited to, a cell proliferative disorder, such as actinic keratosis, arteriosclerosis, atherosclerosis,
bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal
nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers
including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma,
and, in particular, a cancer of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix,
gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas,
parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; an
autoimmune/inflammatory disorder, such as inflammation, actinic keratosis, acquired
immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome,
allergies, ankylosing spondylitis, amyloidosis, anemia, arteriosclerosis, asthma, atherosclerosis,
autoimmune hemolytic anemia, autoimmune thyroiditis, bronchitis, bursitis, cholecystitis, cirrhosis,

disease, Letterer-Siwe disease, sarcoidosis, empty sella syndrome, and dwarfism; a disorder
associated with hyperpituitarism including acromegaly, gigantism, and syndrome of inappropriate
antidiuretic hormone (ADH) secretion (SIADH) often caused by benign adenoma; a disorder
associated with hypothyroidism including goiter, myxedema, acute thyroiditis associated with
bacterial infection, subacute thyroiditis associated with viral infection, autoimmune thyroiditis
(Hashimoto's disease), and cretinism; a disorder associated with hyperthyroidism including
thyrotoxicosis and its various forms, Grave's disease, pretibial myxedema, toxic multinodular goiter,
thyroid carcinoma, and Plummer's disease; a disorder associated with hyperparathyroidism including
Conn disease (chronic hypercalcemia); a pancreatic disorder such as Type I or Type II diabetes
mellitus and associated complications; a disorder associated with the adrenals such as hyperplasia,
carcinoma, or adenoma of the adrenal cortex, hypertension associated with alkalosis, amyloidosis,
hypokalemia, Cushing's disease, Liddle's syndrome, and Arnold-Healy-Gordon syndrome,
pheochromocytoma tumors, and Addison's disease; a disorder associated with gonadal steroid
hormones such as: in women, abnormal prolactin production, infertility, endometriosis, perturbation
of the menstrual cycle, polycystic ovarian disease, hyperprolactinemia, isolated gonadotropin
deficiency, amenorrhea, galactorrhea, hermaphroditism, hirsutism and virilization, breast cancer, and,
in post-menopausal women, osteoporosis; and, in men, Leydig cell deficiency, male climacteric
phase, and germinal cell aplasia, a hypergonadal disorder associated with Leydig cell tumors,
androgen resistance associated with absence of androgen receptors, syndrome of 5 α-reductase, and
gynecomastia; a metabolic disorder such as Addison's disease, cerebrotendinous xanthomatosis,
congenital adrenal hyperplasia, coumarin resistance, cystic fibrosis, diabetes, fatty hepatocirrhosis,
fructose-1,6-diphosphatase deficiency, galactosemia, goiter, glucagonoma, glycogen storage diseases,
hereditary fructose intolerance, hyperadrenalism, hypoadrenalism, hyperparathyroidism,
hypoparathyroidism, hypercholesterolemia, hyperthyroidism, hypoglycemia, hypothyroidism,
hyperlipidemia, hyperlipemia, lipid myopathies, lipodystrophies, lysosomal storage diseases,
mannosidosis, neuraminidase deficiency, obesity, pentosuria, phenylketonuria, pseudovitamin D-deficiency
rickets; disorders of carbohydrate metabolism such as congenital type II dyserythropoietic
anemia, diabetes, insulin-dependent diabetes mellitus, non-insulin-dependent diabetes mellitus,
fructose-1,6-diphosphatase deficiency, galactosemia, glucagonoma, hereditary fructose intolerance,
hypoglycemia, mannosidosis, neuraminidase deficiency, obesity, galactose epimerase deficiency,
glycogen storage diseases, lysosomal storage diseases, fructosuria, pentosuria, and inherited
abnormalities of pyruvate metabolism; disorders of lipid metabolism such as fatty liver, cholestasis,
primary biliary cirrhosis, carnitine deficiency, carnitine palmitoyltransferase deficiency,
myoadenylate deaminase deficiency, hypertriglyceridemia, lipid storage disorders such as Fabry's
disease, Gaucher's disease, Niemann-Pick's disease, metachromatic leukodystrophy,
adrenoleukodystrophy, GM2 gangliosidosis, and ceroid lipofuscinosis, abetalipoproteinemia, Tangier

QT syndrome, myocarditis, cardiomyopathy, nemaline myopathy, centronuclear myopathy, lipid
myopathy, mitochondrial myopathy, thyrotoxic myopathy, ethanol myopathy, dermatomyositis,
inclusion body myositis, infectious myositis, and polymyositis, neurological disorders associated
with transport, e.g., Alzheimer's disease, amnesia, bipolar disorder, dementia, depression, epilepsy,
Tourette's disorder, paranoid psychoses, and schizophrenia, and other disorders associated with
transport, e.g., neurofibromatosis, postherpetic neuralgia, trigeminal neuropathy, sarcoidosis, sickle
cell anemia, cataracts, infertility, pulmonary artery stenosis, sensorineural autosomal deafness,
hyperglycemia, hypoglycemia, Grave's disease, goiter, glucose-galactose malabsorption syndrome,
hypercholesterolemia, Cushing's disease, and Addison's disease; and a connective tissue disorder
such as osteogenesis imperfecta, Ehlers-Danlos syndrome, chondrodysplasias, Marfan syndrome,
Alport syndrome, familial aortic aneurysm, achondroplasia, mucopolysaccharidoses, osteoporosis,
osteopetrosis, Paget's disease, rickets, osteomalacia, hyperparathyroidism, renal osteodystrophy,
osteonecrosis, osteomyelitis, osteoma, osteoid osteoma, osteoblastoma, osteosarcoma,
osteochondroma, chondroma, chondroblastoma, chondromyxoid fibroma, chondrosarcoma, fibrous
cortical defect, nonossifying fibroma, fibrous dysplasia, fibrosarcoma, malignant fibrous
histiocytoma, Ewing's sarcoma, primitive neuroectodermal tumor, giant cell tumor, osteoarthritis,
rheumatoid arthritis, ankylosing spondyloarthritis, Reiter's syndrome, psoriatic arthritis, enteropathic
arthritis, infectious arthritis, gout, gouty arthritis, calcium pyrophosphate crystal deposition disease,
ganglion, synovial cyst, villonodular synovitis, systemic sclerosis, Dupuytren's contracture, hepatic
fibrosis, lupus erythematosus, mixed connective tissue disease, epidermolysis bullosa simplex,
bullous congenital ichthyosiform erythroderma (epidermolytic hyperkeratosis), non-epidermolytic
and epidermolytic palmoplantar keratoderma, ichthyosis bullosa of Siemens, pachyonychia congenita,
and white sponge nevus. The dithp can be used to detect the presence of, or to quantify the amount
of, a dithp-related polynucleotide in a sample. This information is then compared to information
obtained from appropriate reference samples, and a diagnosis is established. Alternatively, a
polynucleotide complementary to a given dithp can inhibit or inactivate a therapeutically relevant
gene related to the dithp.

Analysis of dithp Expression Patterns

The expression of dithp may be routinely assessed by hybridization-based methods to
determine, for example, the tissue-specificity, disease-specificity, or developmental stage-specificity
of dithp expression. For example, the level of expression of dithp may be compared among different
cell types or tissues, among diseased and normal cell types or tissues, among cell types or tissues at
different developmental stages, or among cell types or tissues undergoing various treatments. This
type of analysis is useful, for example, to assess the relative levels of dithp expression in fully or
partially differentiated cells or tissues, to determine if changes in dithp expression levels are
correlated with the development or progression of specific disease states, and to assess the response

radioactive and chemiluminescent labeling (Amersham Pharmacia Biotech) and for alkaline
phosphatase labeling (Life Technologies). Alternatively, dithp may be cloned into commercially
available vectors for the production of RNA probes. Such probes may be transcribed in the presence
of at least one labeled nucleotide (e.g., 32P-ATP, Amersham Pharmacia Biotech).

Additionally, the polynucleotides of SEQ ID NO: 1-2722 or suitable fragments thereof can be
used to isolate full length cDNA sequences utilizing hybridization and/or amplification procedures
well known in the art, e.g., cDNA library screening, PCR amplification, etc. The molecular cloning
of such full length cDNA sequences may employ the method of cDNA library screening with probes
using the hybridization, stringency, washing, and probing strategies described above and in Ausubel,
supra, Chapters 3, 5, and 6. These procedures may also be employed with genomic libraries to isolate
genomic sequences of dithp in order to analyze, e.g., regulatory elements.

Genetic Mapping 

Gene identification and mapping are important in the investigation and treatment of almost all
conditions, diseases, and disorders. Cancer, cardiovascular disease, Alzheimer's disease, arthritis,
diabetes, and mental illnesses are of particular interest. Each of these conditions is more complex
than the single gene defects of sickle cell anemia or cystic fibrosis, with select groups of genes being
predictive of predisposition for a particular condition, disease, or disorder. For example,
cardiovascular disease may result from malfunctioning receptor molecules that fail to clear
cholesterol from the bloodstream, and diabetes may result when a particular individual's immune
system is activated by an infection and attacks the insulin-producing cells of the pancreas. In some
studies, Alzheimer's disease has been linked to a gene on chromosome 21; other studies predict a
different gene and location. Mapping of disease genes is a complex and reiterative process and
generally proceeds from genetic linkage analysis to physical mapping.

As a condition is noted among members of a family, a genetic linkage map traces parts of
chromosomes that are inherited in the same pattern as the condition. Statistics link the inheritance of
particular conditions to particular regions of chromosomes, as defined by RFLP or other markers.
(See, for example, Lander, E.S. and Botstein, D. (1986) Proc. Natl. Acad. Sci. USA 83:7353-7357.)
Occasionally, genetic markers and their locations are known from previous studies. More often,
however, the markers are simply stretches of DNA that differ among individuals. Examples of
genetic linkage maps can be found in various scientific journals or at the Online Mendelian
Inheritance in Man (OMIM) World Wide Web site.
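
One common statistic for linking a condition to a marker is the LOD score, which compares the likelihood of the observed inheritance pattern at a recombination fraction θ with the likelihood under free recombination (θ = 0.5). The text does not name a specific statistic, so the sketch below is an assumption; it also assumes phase-known, fully informative meioses, and the function name is hypothetical.

```python
import math

def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """LOD = log10[ (1 - theta)^(n - r) * theta^r / 0.5^n ]."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    non_recombinants = meioses - recombinants
    log_linked = (non_recombinants * math.log10(1.0 - theta)
                  + recombinants * math.log10(theta))
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked

# One recombinant observed in ten meioses, evaluated at theta = 0.1.
lod = lod_score(recombinants=1, meioses=10, theta=0.1)
```

By convention a LOD of 3 or more is usually taken as evidence of linkage, so the ten-meiosis example alone would not be sufficient.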

In another embodiment of the invention, dithp sequences may be used to generate
hybridization probes useful in chromosomal mapping of naturally occurring genomic sequences.
Either coding or noncoding sequences of dithp may be used, and in some instances, noncoding
sequences may be preferable over coding sequences. For example, conservation of a dithp coding
sequence among members of a multi-gene family may potentially cause undesired cross hybridization

The dithp of the present invention may be used to design probes useful in diagnostic assays.
Such assays, well known to those skilled in the art, may be used to detect or confirm conditions,
disorders, or diseases associated with abnormal levels of dithp expression. Labeled probes developed
from dithp sequences are added to a sample under hybridizing conditions of desired stringency. In
some instances, dithp, or fragments or oligonucleotides derived from dithp, may be used as primers in
amplification steps prior to hybridization. The amount of hybridization complex formed is quantified
and compared with standards for that cell or tissue. If dithp expression varies significantly from the
standard, the assay indicates the presence of the condition, disorder, or disease. Qualitative or
quantitative diagnostic methods may include northern, dot blot, or other membrane or dip-stick based
technologies or multiple-sample format technologies such as PCR, enzyme-linked immunosorbent
assay (ELISA)-like, pin, or chip-based assays.
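
The comparison of a quantified hybridization signal with standards can be sketched as a z-score test. The text says only that expression "varies significantly from the standard"; the z-score, the cutoff of 2.0, and the function name below are assumptions chosen for illustration, and the signal values are made-up placeholders.

```python
from statistics import mean, stdev

def deviation_from_standard(sample_signal: float, standard_signals: list[float],
                            z_cutoff: float = 2.0) -> tuple[float, bool]:
    """Return the z-score of a sample and whether it exceeds the cutoff."""
    mu = mean(standard_signals)
    sd = stdev(standard_signals)          # sample standard deviation
    z = (sample_signal - mu) / sd
    return z, abs(z) >= z_cutoff

# Placeholder hybridization signals for normal tissue of one type.
standards = [100.0, 110.0, 90.0, 105.0, 95.0]
z_high, flagged_high = deviation_from_standard(130.0, standards)
z_norm, flagged_norm = deviation_from_standard(104.0, standards)
```

In practice the reference set would be much larger, and the cutoff would be chosen to suit the assay's variability.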

The probes described above may also be used to monitor the progress of conditions,
disorders, or diseases associated with abnormal levels of dithp expression, or to evaluate the efficacy
of a particular therapeutic treatment. The candidate probe may be identified from the dithp that are
specific to a given human tissue and have not been observed in GenBank or other genome databases.
Such a probe may be used in animal studies, preclinical tests, clinical trials, or in monitoring the
treatment of an individual patient. In a typical process, standard expression is established by methods
well known in the art for use as a basis of comparison, samples from patients affected by the disorder
or disease are combined with the probe to evaluate any deviation from the standard profile, and a
therapeutic agent is administered and effects are monitored to generate a treatment profile. Efficacy
is evaluated by determining whether the expression progresses toward or returns to the standard
normal pattern. Treatment profiles may be generated over a period of several days or several months.
Statistical methods well known to those skilled in the art may be used to determine the significance of
such therapeutic agents.

The polynucleotides are also useful for identifying individuals from minute biological
samples, for example, by matching the RFLP pattern of a sample's DNA to that of an individual's
DNA. The polynucleotides of the present invention can also be used to determine the actual
base-by-base DNA sequence of selected portions of an individual's genome. These sequences can be
used to prepare PCR primers for amplifying and isolating such selected DNA, which can then be
sequenced. Using this technique, an individual can be identified through a unique set of DNA
sequences. Once a unique ID database is established for an individual, positive identification of that
individual can be made from extremely small tissue samples.
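
Matching RFLP patterns amounts to comparing the fragment lengths produced when two DNA samples are cut by the same restriction enzyme. The minimal sketch below assumes linear DNA, the EcoRI recognition site GAATTC (cut after the first G), and exact length comparison with no allowance for gel resolution; the function name and the toy sequences are hypothetical.

```python
def fragment_lengths(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Sorted fragment lengths after cutting linear DNA at every `site`."""
    dna = dna.upper()
    cuts = []
    pos = dna.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)      # EcoRI cuts between G and AATTC
        pos = dna.find(site, pos + 1)
    bounds = [0] + cuts + [len(dna)]
    return sorted(end - start for start, end in zip(bounds, bounds[1:]))

reference = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
variant   = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TT"   # second site lost

same_pattern = fragment_lengths(reference) == fragment_lengths(variant)
```

A single base change that destroys a recognition site changes the fragment pattern, which is what makes the pattern useful for distinguishing individuals.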

In a particular aspect, oligonucleotide primers derived from the dithp of the invention may be
used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions, insertions and
deletions that are a frequent cause of inherited or acquired genetic disease in humans. Methods of
SNP detection include, but are not limited to, single-stranded conformation polymorphism (SSCP)

neomycin phosphotransferase gene (neo; Capecchi, M.R. (1989) Science 244:1288-1292). The vector
integrates into the corresponding region of the host genome by homologous recombination.
Alternatively, homologous recombination takes place using the Cre-loxP system to knockout a gene
of interest in a tissue- or developmental stage-specific manner (Marth, J.D. (1996) Clin. Invest.
97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids Res. 25:4323-4330). Transformed ES cells
are identified and microinjected into mouse cell blastocysts such as those from the C57BL/6 mouse
strain. The blastocysts are surgically transferred to pseudopregnant dams, and the resulting chimeric
progeny are genotyped and bred to produce heterozygous or homozygous strains. Transgenic animals
thus generated may be tested with potential therapeutic or toxic agents.

The dithp of the invention may also be manipulated in vitro in ES cells derived from human
blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell
lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate
into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J.A. et al.
(1998) Science 282:1145-1147).

The dithp of the invention can also be used to create "knockin" humanized animals (pigs) or
transgenic animals (mice or rats) to model human disease. With knockin technology, a region of
dithp is injected into animal ES cells, and the injected sequence integrates into the animal cell
genome. Transformed cells are injected into blastulae, and the blastulae are implanted as described
above. Transgenic progeny or inbred lines are studied and treated with potential pharmaceutical
agents to obtain information on treatment of a human disease. Alternatively, a mammal inbred to
overexpress dithp, resulting, e.g., in the secretion of DITHP in its milk, may also serve as a
convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).

Screening Assays

DITHP encoded by polynucleotides of the present invention may be used to screen for
molecules that bind to or are bound by the encoded polypeptides. The binding of the polypeptide and
the molecule may activate (agonist), increase, inhibit (antagonist), or decrease activity of the
polypeptide or the bound molecule. Examples of such molecules include antibodies,
oligonucleotides, proteins (e.g., receptors), or small molecules.

Preferably, the molecule is closely related to the natural ligand of the polypeptide, e.g., a
ligand or fragment thereof, a natural substrate, or a structural or functional mimetic. (See, Coligan et
al., (1991) Current Protocols in Immunology 1(2): Chapter 5.) Similarly, the molecule can be closely
related to the natural receptor to which the polypeptide binds, or to at least a fragment of the receptor,
e.g., the active site. In either case, the molecule can be rationally designed using known techniques.
Preferably, the screening for these molecules involves producing appropriate cells which express the
polypeptide, either as a secreted protein or on the cell membrane. Preferred cells include cells from
mammals, yeast, Drosophila, or E. coli. Cells expressing the polypeptide or cell membrane fractions

industrial and naturally-occurring environmental compounds. All compounds induce characteristic
gene expression patterns, frequently termed molecular fingerprints or toxicant signatures, which are
indicative of mechanisms of action and toxicity (Nuwaysir, E. F. et al. (1999) Mol. Carcinog. 24:153-159;
Steiner, S. and Anderson, N.L. (2000) Toxicol. Lett. 112-113:467-71, expressly incorporated by
reference herein). If a test compound has a signature similar to that of a compound with known
toxicity, it is likely to share those toxic properties. These fingerprints or signatures are most useful
and refined when they contain expression information from a large number of genes and gene
families. Ideally, a genome-wide measurement of expression provides the highest quality signature.
Even genes whose expression is not altered by any tested compounds are important as well, as the
levels of expression of these genes are used to normalize the rest of the expression data. The
normalization procedure is useful for comparison of expression data after treatment with different
compounds. While the assignment of gene function to elements of a toxicant signature aids in
interpretation of toxicity mechanisms, knowledge of gene function is not necessary for the statistical
matching of signatures which leads to prediction of toxicity. (See, for example, Press Release 00-02
from the National Institute of Environmental Health Sciences, released February 29, 2000, available
at niehs.nih.gov/oc/news/toxchip.htm.) Therefore, it is important and desirable in toxicological
screening using toxicant signatures to include all expressed gene sequences.
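
The two ideas in this passage, normalizing against genes whose expression does not change and statistically matching signatures, can be sketched as follows. The choice of Pearson correlation as the matching statistic, the function names, and the gene labels are assumptions made for illustration; the text does not specify a particular method.

```python
import math
from statistics import mean

def normalize(profile: dict[str, float], reference_genes: list[str]) -> dict[str, float]:
    """Scale a profile by the mean level of genes assumed to be unchanged."""
    scale = mean(profile[g] for g in reference_genes)
    return {gene: level / scale for gene, level in profile.items()}

def pearson(x: list[float], y: list[float]) -> float:
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def signature_similarity(test: dict[str, float], known: dict[str, float],
                         reference_genes: list[str]) -> float:
    """Correlation of two normalized signatures over their shared genes."""
    t = normalize(test, reference_genes)
    k = normalize(known, reference_genes)
    genes = sorted(set(t) & set(k))
    return pearson([t[g] for g in genes], [k[g] for g in genes])

known_toxicant = {"g1": 10.0, "g2": 40.0, "g3": 5.0, "ref": 20.0}
test_same      = {"g1": 20.0, "g2": 80.0, "g3": 10.0, "ref": 40.0}  # same shape, 2x signal
test_other     = {"g1": 40.0, "g2": 10.0, "g3": 30.0, "ref": 20.0}

sim_same = signature_similarity(test_same, known_toxicant, ["ref"])
sim_other = signature_similarity(test_other, known_toxicant, ["ref"])
```

Note that no gene annotation enters the calculation, consistent with the statement that knowledge of gene function is not necessary for signature matching.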

In one embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing nucleic acids with the test compound. Nucleic acids that are expressed in the
treated biological sample are hybridized with one or more probes specific to the polynucleotides of
the present invention, so that transcript levels corresponding to the polynucleotides of the present
invention may be quantified. The transcript levels in the treated biological sample are compared with
levels in an untreated biological sample. Differences in the transcript levels between the two samples
are indicative of a toxic response caused by the test compound in the treated sample.
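
The treated-versus-untreated comparison can be illustrated by computing a log2 fold change per transcript. The pseudocount, the two-fold threshold, and the function name are assumptions for this sketch; the text does not state how large a difference must be to count as a toxic response.

```python
import math

def changed_transcripts(treated: dict[str, float], untreated: dict[str, float],
                        min_abs_log2_fc: float = 1.0,
                        pseudocount: float = 1.0) -> dict[str, float]:
    """Return log2 fold changes for transcripts that differ between samples."""
    changed = {}
    for gene in treated.keys() & untreated.keys():
        # The pseudocount avoids division by zero for undetected transcripts.
        ratio = (treated[gene] + pseudocount) / (untreated[gene] + pseudocount)
        log2_fc = math.log2(ratio)
        if abs(log2_fc) >= min_abs_log2_fc:
            changed[gene] = log2_fc
    return changed

treated_levels = {"a": 79.0, "b": 10.0, "c": 4.0}
untreated_levels = {"a": 19.0, "b": 11.0, "c": 19.0}
response = changed_transcripts(treated_levels, untreated_levels)
```

Here transcript "a" is induced four-fold, "c" is repressed four-fold, and "b" is treated as unchanged.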

Another particular embodiment relates to the use of DITHP encoded by polynucleotides of
the present invention to analyze the proteome of a tissue or cell type. The term proteome refers to the
global pattern of protein expression in a particular tissue or cell type. Each protein component of a
proteome can be subjected individually to further analysis. Proteome expression patterns, or profiles,
are analyzed by quantifying the number of expressed proteins and their relative abundance under
given conditions and at a given time. A profile of a cell's proteome may thus be generated by
separating and analyzing the polypeptides of a particular tissue or cell type. In one embodiment, the
separation is achieved using two-dimensional gel electrophoresis, in which proteins from a sample are
separated by isoelectric focusing in the first dimension, and then according to molecular weight by
sodium dodecyl sulfate slab gel electrophoresis in the second dimension (Steiner and Anderson,
supra). The proteins are visualized in the gel as discrete and uniquely positioned spots, typically by
staining the gel with an agent such as Coomassie Blue or silver or fluorescent stains. The optical

The amount of protein recognized by the antibodies is quantified. The amount of protein in the
treated biological sample is compared with the amount in an untreated biological sample. A
difference in the amount of protein between the two samples is indicative of a toxic response to the
test compound in the treated sample.

Transcript images may be used to profile dithp expression in distinct tissue types. This
process can be used to determine human molecule activity in a particular tissue type relative to this
activity in a different tissue type. Transcript images may be used to generate a profile of dithp
expression characteristic of diseased tissue. Transcript images of tissues before and after treatment
may be used for diagnostic purposes, to monitor the progression of disease, and to monitor the
efficacy of drug treatments for diseases which affect the activity of human molecules.

Transcript images of cell lines can be used to assess human molecule activity and/or to
identify cell lines that lack or misregulate this activity. Such cell lines may then be treated with
pharmaceutical agents, and a transcript image following treatment may indicate the efficacy of these
agents in restoring desired levels of this activity. A similar approach may be used to assess the
toxicity of pharmaceutical agents as reflected by undesirable changes in human molecule activity.
Candidate pharmaceutical agents may be evaluated by comparing their associated transcript images
with those of pharmaceutical agents of known effectiveness.

Antisense Molecules

The polynucleotides of the present invention are useful in antisense technology. Antisense
technology or therapy relies on the modulation of expression of a target protein through the specific
binding of an antisense sequence to a target sequence encoding the target protein or directing its
expression. (See, e.g., Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press Inc., Totowa
NJ; Alama, A. et al. (1997) Pharmacol. Res. 36(3):171-178; Crooke, S.T. (1997) Adv. Pharmacol.
40:1-49; Sharma, H.W. and R. Narayanan (1995) Bioessays 17(12):1055-1063; and Lavrosky, Y. et
al. (1997) Biochem. Mol. Med. 62(1):11-22.) An antisense sequence is a polynucleotide sequence
capable of specifically hybridizing to at least a portion of the target sequence. Antisense sequences
bind to cellular mRNA and/or genomic DNA, affecting translation and/or transcription. Antisense
sequences can be DNA, RNA, or nucleic acid mimics and analogs. (See, e.g., Rossi, J.J. et al. (1991)
Antisense Res. Dev. 1(3):285-288; Lee, R. et al. (1998) Biochemistry 37(3):900-1010; Pardridge,
W.M. et al. (1995) Proc. Natl. Acad. Sci. USA 92(12):5592-5596; and Nielsen, P. E. and Haaima, G.
(1997) Chem. Soc. Rev. 96:73-78.) Typically, the binding which results in modulation of expression
occurs through hybridization or binding of complementary base pairs. Antisense sequences can also
bind to DNA duplexes through specific interactions in the major groove of the double helix.
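
Because antisense binding relies on complementary base pairing, a DNA antisense sequence for a given target is its reverse complement. The following minimal sketch assumes an unmodified DNA target written 5' to 3'; the function name is hypothetical, and the sketch does not address target-site selection or chemical modification.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def antisense_dna(target: str) -> str:
    """Return the reverse complement of a DNA target, written 5' to 3'."""
    return target.translate(_COMPLEMENT)[::-1]

# The antisense strand pairs base-for-base with the target in antiparallel orientation.
probe = antisense_dna("ATGGCC")
```

Applying the function twice returns the original sequence, which is a quick sanity check on any complement table.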

The polynucleotides of the present invention and fragments thereof can be used as antisense
sequences to modify the expression of the polypeptide encoded by dithp. The antisense sequences
can be produced ex vivo, such as by using any of the ABI nucleic acid synthesizer series (Applied

the sequence of interest. (See, e.g., Agrawal, supra.)
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^(n,.™^,ceU.,.«™. (S».e.g.,San*roofc«Ans.W 1995.1^ V^l^ke, 0^ 
r.,lschu«e.a9,9,,.Blo,.Che.«4:5503-SmB..,er.O.«.M..^^^^^^^ 
153,516-^300,=,. C.Ae....a994)Bl^.chnologyl2:1814M;Bngema,d^E.K.«a^^^^^ 
„ Proc NalAcad.Sd.tISA91:3224.32Z;;Sa»Bg.V.=.aL(1996)Hnn,.0»teTher.7..9 74^^^^ 
35 Froc. JNau. /vcaa. o ^ . ^ » /i QRd-> KN4B0 J. 3: 1671-1680; Broghe, 

Takamatsu, N. (1987) BMBO J. 6:307-311; Coruzzi, G. et al. (1984) EMBO 
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R. et 2u. ^L^o'^j ouicnce 224:5io-843; Winter, J. et al. (1991) Results Probl. Cea jl'uici. i /;oj-iuj; 
The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York NY, pp.
191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659; and Harrington,
J. J. et al. (1997) Nat. Genet. 15:345-355.) Expression vectors derived from retroviruses,
adenoviruses, or herpes or vaccinia viruses, or from various bacterial plasmids, may be used for
delivery of nucleotide sequences to the targeted organ, tissue, or cell population. (See, e.g., Di
Nicola, M. et al. (1998) Cancer Gen. Ther. 5(6):350-356; Yu, M. et al., (1993) Proc. Natl. Acad. Sci.
USA 90(13):6340-6344; Buller, R.M. et al. (1985) Nature 317(6040):813-815; McGregor, D.P. et al.
(1994) Mol. Immunol. 31(3):219-226; and Verma, I.M. and N. Somia (1997) Nature 389:239-242.)

The invention is not limited by the host cell employed.

For long term production of recombinant proteins in mammalian systems, stable expression
of DITHP in cell lines is preferred. For example, sequences encoding DITHP can be transformed into
cell lines using expression vectors which may contain viral origins of replication and/or endogenous
expression elements and a selectable marker gene on the same or on a separate vector. Any number
of selection systems may be used to recover transformed cell lines. (See, e.g., Wigler, M. et al.
(1977) Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823; Wigler, M. et al. (1980) Proc. Natl.
Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol. 150:1-14; Hartman,
S.C. and R.C. Mulligan (1988) Proc. Natl. Acad. Sci. USA 85:8047-8051; Rhodes, C.A. (1995)
Methods Mol. Biol. 55:121-131.)

Therapeutic Uses of dithp

The dithp of the invention may be used for somatic or germline gene therapy. Gene therapy
may be performed to (i) correct a genetic deficiency (e.g., in the cases of severe combined
immunodeficiency (SCID)-X1 disease characterized by X-linked inheritance (Cavazzana-Calvo, M. et
al. (2000) Science 288:669-672), severe combined immunodeficiency syndrome associated with an
inherited adenosine deaminase (ADA) deficiency (Blaese, R.M. et al. (1995) Science 270:475-480;
Bordignon, C. et al. (1995) Science 270:470-475), cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-
216; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:643-666; Crystal, R.G. et al. (1995) Hum.
Gene Therapy 6:667-703), thalassemias, familial hypercholesterolemia, and hemophilia resulting
from Factor VIII or Factor IX deficiencies (Crystal, R.G. (1995) Science 270:404-410; Verma, I.M.
and Somia, N. (1997) Nature 389:239-242)), (ii) express a conditionally lethal gene product (e.g., in
the case of cancers which result from unregulated cell proliferation), or (iii) express a protein which
affords protection against intracellular parasites (e.g., against human retroviruses, such as human
immunodeficiency virus (HIV) (Baltimore, D. (1988) Nature 335:395-396; Poeschla, E. et al. (1996)
Proc. Natl. Acad. Sci. USA 93:11395-11399), hepatitis B or C virus (HBV, HCV); fungal parasites,
such as Candida albicans and Paracoccidioides brasiliensis; and protozoan parasites such as
Plasmodium falciparum and Trypanosoma cruzi). In the case where a genetic deficiency in dithp

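expression or regulation causes disease, the expression of dithp from an appropriate population of
transduced cells may alleviate the clinical manifestations caused by the genetic deficiency.

In a further embodiment of the invention, diseases or disorders caused by deficiencies in
dithp are treated by constructing mammalian expression vectors comprising dithp and introducing
these vectors by mechanical means into dithp-deficient cells. Mechanical transfer technologies for
use with cells in vivo or ex vitro include (i) direct DNA microinjection into individual cells, (ii)
ballistic gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene
transfer, and (v) the use of DNA transposons (Morgan, R.A. and Anderson, W.F. (1993) Annu. Rev.
Biochem. 62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J-L. and Recipon, H. (1998) Curr.
Opin. Biotechnol. 9:445-450).

Expression vectors that may be effective for the expression of dithp include, but are not
limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, and PVAX vectors (Invitrogen, Carlsbad CA).
The dithp of the invention may be expressed using (i) a constitutively active promoter (e.g., from
cytomegalovirus (CMV), Rous sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin
genes), (ii) an inducible promoter (e.g., the tetracycline-regulated promoter (Gossen, M. and Bujard,
H. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-
1769; Rossi, F.M.V. and Blau, H.M. (1998) Curr. Opin. Biotechnol. 9:451-456), commercially
available in the T-REX plasmid (Invitrogen); the ecdysone-inducible promoter (available in the
plasmids PVGRXR and PIND; Invitrogen); the FK506/rapamycin inducible promoter; or the
RU486/mifepristone inducible promoter (Rossi, F.M.V. and Blau, H.M., supra)), or (iii) a tissue-specific
promoter or the native promoter of the endogenous gene encoding DITHP from a normal individual.

Commercially available liposome transformation kits (e.g., the PERFECT LIPID
TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver
polynucleotides to target cells in culture and require minimal effort to optimize experimental
parameters. In the alternative, transformation is performed using the calcium phosphate method
(Graham, F.L. and Eb, A.J. (1973) Virology 52:456-467), or by electroporation (Neumann, B. et al.
(1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of
these standardized mammalian transfection protocols.

In another embodiment of the invention, diseases or disorders caused by genetic defects with
respect to dithp expression are treated by constructing a retrovirus vector consisting of (i) dithp under
the control of an independent promoter or the retrovirus long terminal repeat (LTR) promoter, (ii)
appropriate RNA packaging signals, and (iii) a Rev-responsive element (RRE) along with
additional retrovirus cis-acting RNA sequences and coding sequences required for efficient vector
propagation. Retrovirus vectors are commercially available (Stratagene) and are based on
published data (Riviere, I. et al. (1995) Proc. Natl. Acad. Sci. U.S.A. 92:6733-6737), incorporated by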
reference herein. The vector is propagated in an appropriate vector producing cell line (VPCL) that 
expresses an envelope gene with a tropism for receptors on the target cells or a promiscuous envelope 
protein such as VSVg (Armentano, D. et al. (1987) J. Virol. 61:1647-1650; Bender, M.A. et al. 
(1987) J. Virol. 61:1639-1646; Adam, M.A. and Miller, A.D. (1988) J. Virol. 62:3802-3806; Dull, T. 
et al. (1998) J. Virol. 72:8463-8471; Zufferey, R. et al. (1998) J. Virol. 72:9873-9880). U.S. Patent 
No. 5,910,434 to Rigg ("Method for obtaining retrovirus packaging cell lines producing high 
transducing efficiency retroviral supernatant") discloses a method for obtaining retrovirus packaging
cell lines and is hereby incorporated by reference. Propagation of retrovirus vectors, transduction of
a population of cells (e.g., CD4+ T-cells), and the return of transduced cells to a patient are
procedures well known to persons skilled in the art of gene therapy and have been well documented
(Ranga, U. et al. (1997) J. Virol. 71:7020-7029; Bauer, G. et al. (1997) Blood 89:2259-2267;
Bonyhadi, M.L. (1997) J. Virol. 71:4707-4716; Ranga, U. et al. (1998) Proc. Natl. Acad. Sci. U.S.A.
95:1201-1206; Su, L. (1997) Blood 89:2283-2290). 

In the alternative, an adenovirus-based gene therapy delivery system is used to deliver dithp
to cells which have one or more genetic abnormalities with respect to the expression of dithp. The
construction and packaging of adenovirus-based vectors are well known to those with ordinary skill
in the art. Replication defective adenovirus vectors have proven to be versatile for importing genes
encoding immunoregulatory proteins into intact islets in the pancreas (Csete, M.E. et al. (1995)
Transplantation 27:263-268). Potentially useful adenoviral vectors are described in U.S. Patent No.
5,707,618 to Armentano ("Adenovirus vectors for gene therapy"), hereby incorporated by reference.
For adenoviral vectors, see also Antinozzi, P.A. et al. (1999) Annu. Rev. Nutr. 19:511-544 and
Verma, I.M. and Somia, N. (1997) Nature 389:239-242, both incorporated by reference herein.

In another alternative, a herpes-based gene therapy delivery system is used to deliver dithp to
target cells which have one or more genetic abnormalities with respect to the expression of dithp.
The use of herpes simplex virus (HSV)-based vectors may be especially valuable for introducing
dithp to cells of the central nervous system, for which HSV has a tropism. The construction and
packaging of herpes-based vectors are well known to those with ordinary skill in the art. A
replication-competent herpes simplex virus (HSV) type 1-based vector has been used to deliver a
reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res. 169:385-395). The
construction of a HSV-1 virus vector has also been disclosed in detail in U.S. Patent No. 5,804,413 to
DeLuca ("Herpes simplex virus strains for gene transfer"), which is hereby incorporated by reference.
U.S. Patent No. 5,804,413 teaches the use of recombinant HSV d92 which consists of a genome
containing at least one exogenous gene to be transferred to a cell under the control of the appropriate
promoter for purposes including human gene therapy. Also taught by this patent are the construction
and use of recombinant HSV strains deleted for ICP4, ICP27 and ICP22. For HSV vectors, see also

at least five amino acids, preferably at least 10 amino acids, and most preferably at least 15 amino
acids. A peptide which mimics an antigenic fragment of the natural polypeptide may be fused with
another protein such as keyhole limpet hemocyanin (KLH; Sigma, St. Louis MO) for antibody
production. A peptide encompassing an antigenic region may be expressed from a dithp, synthesized
as described above, or purified from human cells.

Procedures well known in the art may be used for the production of antibodies. Various hosts
including mice, goats, and rabbits, may be immunized by injection with a peptide. Depending on the
host species, various adjuvants may be used to increase immunological response.

In one procedure, peptides about 15 residues in length may be synthesized using an ABI
431A peptide synthesizer (Applied Biosystems) using fmoc-chemistry and coupled to KLH (Sigma)
by reaction with M-maleimidobenzoyl-N-hydroxysuccinimide ester (Ausubel, 1995, supra). Rabbits
are immunized with the peptide-KLH complex in complete Freund's adjuvant. The resulting antisera
are tested for antipeptide activity by binding the peptide to plastic, blocking with 1% bovine serum
albumin (BSA), reacting with rabbit antisera, washing, and reacting with radioiodinated goat anti-
rabbit IgG. Antisera with antipeptide activity are tested for anti-DITHP activity using protocols well
known in the art, including ELISA, radioimmunoassay (RIA), and immunoblotting.

In another procedure, isolated and purified peptide may be used to immunize mice (about 100
μg of peptide) or rabbits (about 1 mg of peptide). Subsequently, the peptide is radioiodinated and
used to screen the immunized animals' B-lymphocytes for production of antipeptide antibodies.
Positive cells are then used to produce hybridomas using standard techniques. About 20 mg of
peptide is sufficient for labeling and screening several thousand clones. Hybridomas of interest are
detected by screening with radioiodinated peptide to identify those fusions producing peptide-specific
monoclonal antibody. In a typical protocol, wells of a multi-well plate (FAST, Becton-Dickinson,
Palo Alto, CA) are coated with affinity-purified, specific rabbit-anti-mouse (or suitable anti-species
IgG) antibodies at 10 mg/ml. The coated wells are blocked with 1% BSA and washed and exposed to
supernatants from hybridomas. After incubation, the wells are exposed to radiolabeled peptide at 1
mg/ml.

Clones producing antibodies bind a quantity of labeled peptide that is detectable above
background. Such clones are expanded and subjected to 2 cycles of cloning. Cloned hybridomas are
injected into pristane-treated mice to produce ascites, and monoclonal antibody is purified from the
ascitic fluid by affinity chromatography on protein A (Amersham Pharmacia Biotech). Several
procedures for the production of monoclonal antibodies, including in vitro production, are described
in Pound (supra). Monoclonal antibodies with antipeptide activity are tested for anti-DITHP activity
using protocols well known in the art, including ELISA, RIA, and immunoblotting.

Antibody fragments containing specific binding sites for an epitope may also be generated.
For example, such fragments include, but are not limited to, the F(ab')2 fragments produced by pepsin

Protocols for detecting and measuring protein expression using either polyclonal or
monoclonal antibodies are well known in the art. Examples include ELISA, RIA, and

The disclosures of all patents, applications, and publications mentioned above and below,
including U.S. Ser. No. 60/410,260 and U.S. Ser. No. 60/410,259, are hereby expressly incorporated
by reference.

EXAMPLES

I. Construction of cDNA Libraries

RNA was purchased from vendors (for example, CLONTECH Laboratories, Inc. (Palo Alto
CA)), provided by clients or collaborators, or isolated from various tissues. Some tissues were
homogenized and lysed in guanidinium isothiocyanate, while others were homogenized and lysed in
phenol or in a suitable mixture of denaturants, such as TRIZOL (Life Technologies), a monophasic
solution of phenol and guanidine isothiocyanate. The resulting lysates were centrifuged over CsCl
cushions or extracted with chloroform. RNA was precipitated with either isopropanol or sodium
acetate and ethanol, or by other routine methods.

Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA
purity.

OLIGOTEX latex particles (QIAGEN, Inc. (QIAGEN), Valencia CA), or an OLIGOTEX mRNA
purification kit (QIAGEN). Alternatively, RNA was isolated directly from tissue lysates using other
RNA isolation kits, e.g., the POLY(A)PURE mRNA purification kit (Ambion, Inc., Austin TX).

In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA
libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the UNIZAP
vector system (Stratagene Cloning Systems, Inc. (Stratagene), La Jolla CA) or SUPERSCRIPT
plasmid system (Life Technologies), using the recommended procedures or similar methods known in
the art. (See, e.g., Ausubel, 1997, supra, Chapters 5.1 through 6.6.) Reverse transcription was
initiated using oligo d(T) or random primers. Synthetic oligonucleotide adapters were ligated to
double stranded cDNA, and the cDNA was digested with the appropriate restriction enzyme or
enzymes. For most libraries, the cDNA was size-selected (≥300 bp) using SEPHACRYL S1000,
SEPHAROSE CL2B, or SEPHAROSE CL4B column chromatography (Amersham Pharmacia
Biotech) or preparative agarose gel electrophoresis. cDNAs were ligated into compatible restriction
enzyme sites of the polylinker of a suitable plasmid, e.g., PBLUESCRIPT plasmid (Stratagene),
PSPORT1 plasmid (Life Technologies), PCDNA2.1 plasmid (Invitrogen, Carlsbad CA), PBK-CMV
plasmid (Stratagene), PCR2-TOPOTA plasmid (Invitrogen), PCMV-ICIS plasmid (Stratagene),
pIGEN (Incyte), pRARE (Incyte), or pINCY (Incyte), or derivatives thereof. Recombinant plasmids
were transformed into competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from
Stratagene or DH5α, DH10B, or ElectroMAX DH10B from Life Technologies.

Alternatively, multiple clones needing complete insert sequencing are pooled into 'shot-gun'
libraries. The cDNA inserts from these pools are amplified by PCR, mechanically sheared into
smaller pieces, and cloned into plasmid vectors for vector primer sequencing. Assembly of the 
nucleic acid sequences of the small pieces into their respective parent full-length inserts can then be 
accomplished using sequence assembly programs such as PHRAP (phrap.org/phrap.docs/phrap.html). 
II. Isolation of cDNA Clones

Plasmids were recovered from host cells by in vivo excision using the UNIZAP vector system
(Stratagene) or by cell lysis. Plasmids were purified using at least one of the following: the Magic or
WIZARD Minipreps DNA purification system (Promega); the AGTC Miniprep purification kit (Edge
BioSystems, Gaithersburg MD); and the QIAWELL 8, QIAWELL 8 Plus, and QIAWELL 8 Ultra
plasmid purification systems or the R.E.A.L. PREP 96 plasmid purification kit (QIAGEN).
Following precipitation, plasmids were resuspended in 0.1 ml of distilled water and stored, with or
without lyophilization, at 4°C.

Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a
high-throughput format. (Rao, V.B. (1994) Anal. Biochem. 216:1-14.) Host cell lysis and thermal
cycling steps were carried out in a single reaction mixture. Samples were processed and stored in
384-well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically

spurious matches.

Processed sequences were then subject to assembly procedures in which the sequences were
assigned to gene bins (bins). Each sequence could only belong to one bin. Sequences in each gene
bin were assembled to produce consensus sequences (templates). Subsequent new sequences were
added to existing bins using BLASTN (v. 1.4 WashU) and CROSSMATCH. Candidate pairs were
identified as all BLAST hits having a quality score greater than or equal to 150. Alignments of at
least 82% local identity were accepted into the bin. The component sequences from each bin were
assembled using a version of PHRAP. Bins with several overlapping component sequences were
assembled using DEEP PHRAP. The orientation (sense or antisense) of each assembled template was
determined based on the number and orientation of its component sequences. Template sequences
disclosed in the sequence listing correspond to sense strand sequences (the "forward" reading
frames), to the best determination. The complementary (antisense) strands are inherently disclosed 
herein. The component sequences which were used to assemble each template consensus sequence 
are listed in Table 5 of U.S. Ser. No. 60/410,260 and U.S. Ser. No. 60/410,259, along with their 
positions along the template nucleotide sequences, and are hereby expressly incorporated by 
reference. 
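
The bin-assignment rule described above can be sketched as follows. This is only an illustrative sketch: the Hit record, the assign_to_bin function, and the tie-breaking by highest quality score are assumptions made for the example and are not taken from the BLASTN/CROSSMATCH procedure itself; only the two thresholds (quality score of at least 150, local identity of at least 82%) come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_QUALITY_SCORE = 150    # candidate pairs: BLAST quality score >= 150
MIN_LOCAL_IDENTITY = 82.0  # accepted alignments: >= 82% local identity


@dataclass
class Hit:
    """One alignment of a new sequence against a bin (hypothetical record)."""
    bin_id: str
    quality_score: float
    local_identity: float  # percent


def assign_to_bin(hits: List[Hit]) -> Optional[str]:
    """Return the single bin a new sequence joins, or None if no hit qualifies."""
    accepted = [h for h in hits
                if h.quality_score >= MIN_QUALITY_SCORE
                and h.local_identity >= MIN_LOCAL_IDENTITY]
    if not accepted:
        return None
    # Assumption: when several bins qualify, the highest quality score wins,
    # so that each sequence ends up in exactly one bin.
    return max(accepted, key=lambda h: h.quality_score).bin_id


hits = [Hit("bin_A", 140, 99.0), Hit("bin_B", 210, 85.5), Hit("bin_C", 300, 80.0)]
print(assign_to_bin(hits))  # bin_B: the only hit passing both thresholds
```

A sequence with no qualifying hit returns None here; in practice such a sequence would presumably seed a new bin, but the text does not say so explicitly.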

Bins were compared against each other and those having local similarity of at least 82% were 
combined and reassembled. Reassembled bins having templates of insufficient overlap (less than 
95% local identity) were re-split. Assembled templates were also subject to analysis by 
STITCHER/EXON MAPPER algorithms which analyze the probabilities of the presence of splice
variants, alternatively spliced exons, splice junctions, differential expression of alternative spliced 
genes across tissue types or disease states, etc. These resulting bins were subject to several rounds of 
the above assembly procedures. 

Once gene bins were generated based upon sequence alignments, bins were clone joined 
based upon clone information. If the 5' sequence of one clone was present in one bin and the 3' 
sequence from the same clone was present in a different bin, it was likely that the two bins actually 
belonged together in a single bin. The resulting combined bins underwent assembly procedures to 
regenerate the consensus sequences. 
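
Clone joining as described above amounts to merging bins that are linked by the two end reads of one clone. A minimal sketch using a union-find structure is given below; the function name clone_join, the input layout, and the bin names are hypothetical and chosen only for illustration.

```python
def clone_join(bin_of_read):
    """Merge bins linked by the 5' and 3' reads of the same clone.

    bin_of_read maps (clone_id, end) -> bin_id, where end is "5'" or "3'".
    Returns a dict mapping each original bin to its merged representative.
    """
    parent = {b: b for b in bin_of_read.values()}

    def find(b):
        # Follow parent links to the representative, shortening the path.
        while parent[b] != b:
            parent[b] = parent[parent[b]]
            b = parent[b]
        return b

    # Group the reads of each clone by end.
    ends = {}
    for (clone, end), b in bin_of_read.items():
        ends.setdefault(clone, {})[end] = b

    # A clone with its 5' read in one bin and its 3' read in another joins them.
    for by_end in ends.values():
        if "5'" in by_end and "3'" in by_end:
            a, b = find(by_end["5'"]), find(by_end["3'"])
            if a != b:
                parent[max(a, b)] = min(a, b)  # keep the smaller name as root

    return {b: find(b) for b in parent}


reads = {("clone1", "5'"): "bin1", ("clone1", "3'"): "bin2",
         ("clone2", "5'"): "bin2", ("clone2", "3'"): "bin3",
         ("clone3", "5'"): "bin4"}
merged = clone_join(reads)
print(merged)
```

In this example bin1, bin2, and bin3 collapse into one combined bin, which would then be reassembled to regenerate its consensus sequence, while bin4 is left alone.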

The final assembled templates were subsequently annotated using the following procedure.
Template sequences were analyzed using BLASTN (v2.0, NCBI) versus gbpri (GenBank version
130). "Hits" were defined as an exact match having from 95% local identity over 200 base pairs
through 100% local identity over 100 base pairs, or a homolog match having an E-value, i.e. an
expected by chance value, of ≤ 1 x 10"^ The hits were subject to frameshift FASTx versus GENPEPT
(GenBank version 130). (See Table 6). In this analysis, a homolog match was defined as having an
E-value of ≤ 1 x 10*^. The assembly method used above was described in "System and Methods for
Analyzing Biomolecular Sequences," U.S.S.N. 09/276,534, filed March 25, 1999, and the LIFESEQ

sequences are queried against public databases such as the GenBank primate, rodent, mammalian,
vertebrate, prokaryote, and eukaryote databases. Polynucleotide and polypeptide sequence
alignments are generated using default parameters specified by the CLUSTAL algorithm as
incorporated into the MEGALIGN multisequence alignment program (DNASTAR), which also
calculates the percent identity between aligned sequences.

An alternative analysis of genomic transcripts involves using the annotation resulting from a
homology search using the BLAST algorithms (v 2.0, NCBI) against both public (GenBank, NCBI)
and internal (LIFESEQ, LIFESEQ FOUNDATION, Incyte) databases. The BLASTX algorithm was
used to compare partial transcripts to SwissProt (version 40.22) and GenPept (NCBI, version
130_20020629) and Proteome BioKnowledge Library (BKL v. 020612) databases. Additionally,
BLASTN was used to compare the genomic transcripts to the primate division sequence database of
GenBank (gbpri, version 130_20020629) with sequences greater than 50 Kb being removed.
Homologs were identified as those sequences having a BLAST E-value (expect value, the number of
matches with the same quality expected purely by chance) of ≤ 1 x 10'^ for BLASTX comparisons
and 1 x 10"^ for BLASTN comparison. The five homologs having the lowest E-values were saved
from each database comparison. An additional five homologs having the lowest E-values obtained
were saved when comparisons were made against the Protein Data Bank (version 20020624, Berman,
H.M. et al. (2000) Nucleic Acids Res. 28:235-242). The Protein Data Bank can provide putative
structural information for the transcripts.
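
The selection of the five best homologs per database can be illustrated with the short sketch below. The function name best_homologs, the hit names, and in particular the cutoff value passed in the example are placeholders chosen for illustration; they are not the E-value thresholds of the procedure described above.

```python
import heapq


def best_homologs(hits, max_e_value, keep=5):
    """Return up to `keep` (name, e_value) hits with the lowest E-values
    among those whose E-value does not exceed `max_e_value`."""
    passing = [h for h in hits if h[1] <= max_e_value]
    return heapq.nsmallest(keep, passing, key=lambda h: h[1])


hits = [("P1", 1e-30), ("P2", 1e-3), ("P3", 1e-12), ("P4", 1e-50),
        ("P5", 1e-9), ("P6", 1e-20), ("P7", 1e-15)]

# The cutoff below is a placeholder value for the example only.
top = best_homologs(hits, max_e_value=1e-8)
print([name for name, _ in top])  # ['P4', 'P1', 'P6', 'P7', 'P3']
```

P2 fails the cutoff and P5, although passing, is the sixth-best hit and is therefore not saved.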

The cDNA transcripts identified from genomic contigs were also analyzed by translating each
transcript in the three forward reading frames and searching each resulting translation against the
Pfam database of hidden Markov model (HMM) protein families and domains (Pfam version 7.2).
The HMM search algorithm is performed using TimeLogic DECYPHER (version 7.0.0.34)
(TimeLogic Corp., Crystal Bay, NV) and Daemon (version 7.2.3.867, TimeLogic). The Pfam
comparison scores which were better than the global cutoff are assigned to the respective HMMs
reported.
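
Translation in the three forward reading frames, the step that precedes the Pfam search, can be sketched as follows. The sketch uses the standard genetic code; the function name translate_forward_frames and the choice of "X" for codons containing ambiguous bases are assumptions of the example, and the HMM search itself is not shown.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (stop codons as "*").
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_forward_frames(dna):
    """Translate a DNA sequence in forward frames 1, 2, and 3."""
    dna = dna.upper()
    frames = []
    for offset in range(3):
        # Only complete codons are translated; trailing bases are ignored.
        codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
        # Assumption: codons with ambiguous bases are written as "X".
        frames.append("".join(CODON_TABLE.get(c, "X") for c in codons))
    return frames


print(translate_forward_frames("ATGGCCTAAGG"))  # ['MA*', 'WPK', 'GLR']
```

Each of the three strings would then be submitted separately to the protein family search.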

The resulting transcript annotation was categorized based on Protein Function Hierarchy
(PFH) classification and Gene Ontology (GO, The Gene Ontology Consortium, (2000) Nature Genet.
25:25-29). PFH assignments were determined based on keywords of a representative homolog. The
representative homolog was determined by comparing the BLAST homology data of the top five
homologs from SwissProt, GenPept and gbpri databases. For each transcript, the gbpri hit was
multiplied by a scaling factor of 0.26 to make it comparable to BLASTX protein scores. Homologs
which were not within 20% of the greatest BLAST score, which is given as the homolog's greatest
BLAST score local alternative alignment (HSP), were removed and not used in further analyses.
From the remaining homologs, the greatest SwissProt hit was selected as the representative homolog,
when available; otherwise the greatest scoring GenPept hit was chosen as the representative homolog.
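
The selection rule for the representative homolog can be sketched as below. The scaling factor (0.26), the 20% window, and the SwissProt-then-GenPept preference are taken from the text; the function name, the tuple layout, the reading of "within 20%" as a score of at least 80% of the best score, and the return of None when neither protein database has a surviving hit are assumptions of the example.

```python
GBPRI_SCALE = 0.26      # gbpri (nucleotide) scores scaled toward protein scores
WITHIN_FRACTION = 0.20  # keep homologs within 20% of the greatest score


def representative_homolog(homologs):
    """homologs: list of (database, name, blast_score) tuples."""
    scaled = [(db, name, score * GBPRI_SCALE if db == "gbpri" else score)
              for db, name, score in homologs]
    if not scaled:
        return None
    best = max(score for _, _, score in scaled)
    # Assumption: "within 20%" means score >= 80% of the greatest score.
    kept = [h for h in scaled if h[2] >= best * (1 - WITHIN_FRACTION)]
    for preferred_db in ("SwissProt", "GenPept"):
        candidates = [h for h in kept if h[0] == preferred_db]
        if candidates:
            return max(candidates, key=lambda h: h[2])
    return None  # assumption: the text names no further fallback


homologs = [("SwissProt", "SP_A", 410.0), ("GenPept", "GP_B", 500.0),
            ("gbpri", "GB_C", 1800.0), ("SwissProt", "SP_D", 300.0)]
print(representative_homolog(homologs))  # ('SwissProt', 'SP_A', 410.0)
```

Here the GenPept hit has the greatest score, but the SwissProt hit SP_A lies within the 20% window and is therefore preferred; SP_D falls outside the window and is discarded.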

derived from a cDNA library constructed from a human tissue. Each human tissue is classified into
one of the following organ/tissue categories: cardiovascular system; connective tissue; digestive
system; embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male;
germ cells; hemic and immune system; liver; musculoskeletal system; nervous system; pancreas;
respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract.
The number of libraries in each category for each polynucleotide sequence encoding DITHP is
counted and divided by the total number of libraries across all categories for each polynucleotide
sequence encoding DITHP. Similarly, each human tissue is classified into one of the following
disease/condition categories: cancer, cell line, developmental, inflammation, neurological, trauma,
cardiovascular, pooled, and other, and the number of libraries in each category for each
polynucleotide sequence encoding DITHP is counted and divided by the total number of libraries
across all categories for each polynucleotide sequence encoding DITHP. The resulting percentages
reflect the tissue-specific and disease-specific expression of cDNA encoding DITHP. Percentage
values of tissue-specific expression are reported in Table 5. cDNA sequences and cDNA
library/tissue information are found in the LIFESEQ database (Incyte).
VI. Tissue Distribution Profiling 

A tissue distribution profile is determined for each template by compiling the cDNA library 
tissue classifications of its component cDNA sequences. Each component sequence is derived from
a cDNA library constructed from a human tissue. Each human tissue is classified into one of the
following categories: cardiovascular system; connective tissue; digestive system; embryonic 
structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ cells; hemic 
and immune system; liver; musculoskeletal system; nervous system; pancreas; respiratory system; 
sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract. Template sequences, 
component sequences, and cDNA library/tissue information are found in the LIFESEQ database 
(Incyte). 

Table 5 shows the tissue distribution profile for the templates of the invention. For each 
template, the three most frequently observed tissue categories are shown in column 3, along with the
percentage of component sequences belonging to each category. Only tissue categories with
percentage values of ≥10% are shown. A tissue distribution of "widely distributed" in column 3
indicates percentage values of <10% in all tissue categories. 
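
The way a tissue distribution entry is derived from the component sequences of a template can be sketched as follows. The three-category limit and the 10% cutoff come from the text; the function name tissue_profile and the example counts are hypothetical.

```python
from collections import Counter


def tissue_profile(component_tissues, top_n=3, min_percent=10.0):
    """Return the most frequent tissue categories with their percentages,
    or "widely distributed" if no category reaches `min_percent`."""
    counts = Counter(component_tissues)
    total = sum(counts.values())
    shown = [(tissue, 100.0 * n / total)
             for tissue, n in counts.most_common(top_n)
             if 100.0 * n / total >= min_percent]
    return shown or "widely distributed"


# 20 component sequences: 60%, 25%, 10%, and 5% per category.
tissues = (["nervous system"] * 12 + ["liver"] * 5
           + ["skin"] * 2 + ["pancreas"] * 1)
print(tissue_profile(tissues))

# 11 categories with one component sequence each: every category is below 10%.
print(tissue_profile([f"category {i}" for i in range(11)]))
```

The first call lists nervous system, liver, and skin; the second returns "widely distributed" because each category accounts for only about 9% of the component sequences.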
VII. Transcript Image Analysis

Transcript images are generated as described in Seilhamer et al., "Comparative Gene
Transcript Analysis," U.S. Patent No. 5,840,484, incorporated herein by reference.
VIII. Extension of Polynucleotide Sequences and Isolation of a Full-length cDNA

Oligonucleotide primers designed using a dithp of the Sequence Listing are used to extend
the nucleic acid sequence. One primer is synthesized to initiate 5' extension of the template, and the

antibiotic-containing media, individual colonies are picked and cultured overnight at 37°C in 384-
well plates in LB/2x carbenicillin liquid media.

The cells are lysed, and DNA is amplified by PCR using Taq DNA polymerase (Amersham
Pharmacia Biotech) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 1:
94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 2 min; Step 5: steps 2, 3, and 4
repeated 29 times; Step 6: 72°C, 5 min; Step 7: storage at 4°C. DNA is quantified by PICOGREEN
reagent (Molecular Probes) as described above. Samples with low DNA recoveries are reamplified
using the same conditions as described above. Samples are diluted with 20% dimethylsulfoxide (1:2,
v/v), and sequenced using DYENAMIC energy transfer sequencing primers and the DYENAMIC
DIRECT kit (Amersham Pharmacia Biotech) or the ABI PRISM BIGDYE Terminator cycle
sequencing ready reaction kit (Applied Biosystems).
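
The thermal cycling parameters given above (steps 1 through 6) can be written out as an explicit program, which makes the cycle count easy to check: steps 2 to 4 run once and are then repeated 29 times, i.e. 30 cycles in total. The sketch below is illustrative only; the variable and function names are not from any instrument software, and ramp times between temperatures are ignored.

```python
INITIAL_DENATURATION = (94, 180)         # step 1: (deg C, seconds)
CYCLE = [(94, 15), (60, 60), (72, 120)]  # steps 2-4
EXTRA_REPEATS = 29                       # step 5: steps 2-4 repeated 29 times
FINAL_EXTENSION = (72, 300)              # step 6


def expand_program():
    """Return the full list of (temperature, seconds) holds before storage."""
    cycles = 1 + EXTRA_REPEATS  # first pass plus the 29 repeats
    return [INITIAL_DENATURATION] + CYCLE * cycles + [FINAL_EXTENSION]


program = expand_program()
total_seconds = sum(seconds for _, seconds in program)
print(len(program), total_seconds / 60)  # 92 holds, 105.5 minutes
```

The program therefore takes about 105.5 minutes of hold time before the 4°C storage step.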

In like manner, the dithp is used to obtain regulatory sequences (promoters, introns, and 
enhancers) using the procedure above, oligonucleotides designed for such extension, and an 
appropriate genomic library. 

IX. Labeling of Probes and Southern Hybridization Analyses

Hybridization probes derived from the dithp of the Sequence Listing are employed for
screening cDNAs, mRNAs, or genomic DNA. The labeling of probe nucleotides between 100 and
1000 nucleotides in length is specifically described, but essentially the same procedure may be used
with larger cDNA fragments. Probe sequences are labeled at room temperature for 30 minutes using
a T4 polynucleotide kinase, γ-32P-ATP, and 0.5X One-Phor-All Plus (Amersham Pharmacia Biotech)
buffer and purified using a ProbeQuant G-50 Microcolumn (Amersham Pharmacia Biotech). The
probe mixture is diluted to 10^7 dpm/μg/ml hybridization buffer and used in a typical membrane-based
hybridization analysis.

The DNA is digested with a restriction endonuclease such as Eco RV and is electrophoresed
through a 0.7% agarose gel. The DNA fragments are transferred from the agarose to nylon membrane
(NYTRAN Plus, Schleicher & Schuell, Inc., Keene NH) using procedures specified by the
manufacturer of the membrane. Prehybridization is carried out for three or more hours at 68°C, and
hybridization is carried out overnight at 68°C. To remove non-specific signals, blots are sequentially
washed at room temperature under increasingly stringent conditions, up to 0.1x saline sodium citrate
(SSC) and 0.5% sodium dodecyl sulfate. After the blots are placed in a PHOSPHORIMAGER
cassette (Molecular Dynamics) or are exposed to autoradiography film, hybridization patterns of
standard and experimental lanes are compared. Essentially the same procedure is employed when
screening RNA.

X. Chromosome Mapping of dithp

The cDNA sequences which were used to assemble SEQ ID NO: 1-2722 are compared with
sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other

of the present invention are used to generate array elements. Each array element
is amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification
uses primers complementary to the vector sequences flanking the cDNA insert. Array elements are
amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5
μg. Amplified array elements are then purified using SEPHACRYL-400 (Amersham Pharmacia
Biotech).

Purified array elements are immobilized on polymer-coated glass slides. Glass microscope 
slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water
washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR
Scientific Products Corporation (VWR), West Chester, PA), washed extensively in distilled water,
and coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated slides are cured in a 
110°C oven. 

Array elements are applied to the coated glass substrate using a procedure described in US
Patent No. 5,807,522, incorporated herein by reference. 1 μl of the array element DNA, at an average
concentration of 100 ng/μl, is loaded into the open capillary printing element by a high-speed robotic
apparatus. The apparatus then deposits about 5 nl of array element sample per slide.
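
The quantities above imply a simple budget per loaded capillary, worked through in the sketch below. The function name spotting_budget is hypothetical, and the calculation assumes an ideal case with no dead volume or evaporation, which the text does not address.

```python
def spotting_budget(loaded_nl, conc_ng_per_ul, deposit_nl):
    """Return (DNA per deposit in ng, number of deposits per load).

    Volumes are given in nanoliters to keep the arithmetic exact.
    """
    dna_per_spot_ng = conc_ng_per_ul * deposit_nl / 1000  # 1 ul = 1000 nl
    max_slides = loaded_nl // deposit_nl
    return dna_per_spot_ng, max_slides


# 1 ul (1000 nl) loaded at 100 ng/ul, about 5 nl deposited per slide.
print(spotting_budget(1000, 100, 5))  # (0.5, 200)
```

Under these idealized assumptions each deposit carries about 0.5 ng of array element DNA, and one load would suffice for up to about 200 slides.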

Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene).
Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water.
Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate
buffered saline (PBS) (Tropix, Inc., Bedford, MA) for 30 minutes at 60°C followed by washes in
0.2% SDS and distilled water as before.

Hybridization 

Hybridization reactions contain 9 μl of probe mixture consisting of 0.2 μg each of Cy3 and
Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The probe
mixture is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and covered with
a 1.8 cm² coverslip. The arrays are transferred to a waterproof chamber having a cavity just slightly
larger than a microscope slide. The chamber is kept at 100% humidity internally by the addition of
140 μl of 5X SSC in a corner of the chamber. The chamber containing the arrays is incubated for
about 6.5 hours at 60°C. The arrays are washed for 10 min at 45°C in a first wash buffer (1X SSC,
0.1% SDS), three times for 10 minutes each at 45°C in a second wash buffer (0.1X SSC), and dried.
Detection 

Reporter-labeled hybridization complexes are detected with a microscope equipped with an
Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral lines
at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is
focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY). The slide
containing the array is placed on a computer-controlled X-Y stage on the microscope and raster-

intestine (9%), spleen (3%), stomach (6%), testis (9%),
Normal tissues from at least three different donors are assayed. RNA from each donor is separately
isolated and is individually hybridized to a microarray. Because hybridization experiments are
conducted using a common reference sample, differential expression values are directly comparable
from one tissue to another. The resulting increase in expression by at least two-fold for any given
RNA assayed indicates the usefulness of the sequence as a tissue-specific marker for the tissue from
which it was isolated.
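
The two-fold criterion can be expressed as a simple filter over expression values measured against the common reference sample. The sketch below is illustrative: the function name, the input layout, and the example ratios are hypothetical, and replicate donors are not modeled.

```python
def tissue_specific_markers(ratios, min_fold=2.0):
    """ratios: {tissue: expression relative to the common reference sample}.

    Returns the tissues in which expression is increased at least `min_fold`.
    """
    return sorted(tissue for tissue, ratio in ratios.items() if ratio >= min_fold)


# Hypothetical ratios for one sequence across four tissues.
ratios = {"testis": 3.4, "spleen": 1.1, "stomach": 2.0, "small intestine": 0.6}
print(tissue_specific_markers(ratios))  # ['stomach', 'testis']
```

Because every tissue is compared with the same reference, the ratios can be compared directly across tissues, which is what makes a single threshold meaningful.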
Xn. Complenoientary Nucleic Acids 

Sequences complementary to the dithp are used to detect, decrease, or inhibit expression of
the naturally occurring nucleotide. The use of oligonucleotides comprising from about 15 to 30 base
pairs is typical in the art. However, smaller or larger sequence fragments can also be used.
Appropriate oligonucleotides are designed from the dithp using OLIGO 4.06 software (National
Biosciences) or other appropriate programs and are synthesized using methods standard in the art or
ordered from a commercial supplier. To inhibit transcription, a complementary oligonucleotide is
designed from the most unique 5' sequence and used to prevent transcription factor binding to the
promoter sequence. To inhibit translation, a complementary oligonucleotide is designed to prevent
ribosomal binding and processing of the transcript.
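
As a minimal sketch of what "complementary" means here, the following Python function returns the reverse complement of a chosen window of a target sequence. It does not reproduce the OLIGO 4.06 design criteria; the function name `complementary_oligo`, the default length of 20, and the example sequence are assumptions made for illustration.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complementary_oligo(target, start, length=20):
    """Return the reverse complement of target[start:start + length],
    written 5' to 3'."""
    window = target[start:start + length]
    if len(window) < length:
        raise ValueError("window runs past the end of the target sequence")
    return window.translate(_COMPLEMENT)[::-1]


# Hypothetical 15-base target region (illustrative only).
print(complementary_oligo("ATGCGTACCGTTAGC", 0, 15))
```

The length is left as a parameter because the text notes that fragments smaller or larger than 15 to 30 bases can also be used.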
XIII. Expression of DITHP

Expression and purification of DITHP is accomplished using bacterial or virus-based
expression systems. For expression of DITHP in bacteria, cDNA is subcloned into an appropriate
vector containing an antibiotic resistance gene and an inducible promoter that directs high levels of
cDNA transcription. Examples of such promoters include, but are not limited to, the trp-lac (tac)
hybrid promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator
regulatory element. Recombinant vectors are transformed into suitable bacterial hosts, e.g.,
BL21(DE3). Antibiotic resistant bacteria express DITHP upon induction with isopropyl
beta-D-thiogalactopyranoside (IPTG). Expression of DITHP in eukaryotic cells is achieved by infecting
insect or mammalian cell lines with recombinant Autographica californica nuclear polyhedrosis virus
(AcMNPV), commonly known as baculovirus. The nonessential polyhedrin gene of baculovirus is
replaced with cDNA encoding DITHP by either homologous recombination or bacterial-mediated
transposition involving transfer plasmid intermediates. Viral infectivity is maintained and the strong
polyhedrin promoter drives high levels of cDNA transcription. Recombinant baculovirus is used to
infect Spodoptera frugiperda (Sf9) insect cells in most cases, or human hepatocytes, in some cases.
Infection of the latter requires additional genetic modifications to baculovirus. (See, e.g., Engelhard,
supra; and Sandig, supra.)

In most expression systems, DITHP is synthesized as a fusion protein with, e.g., glutathione
S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step.
DITHP hydrolase activity is measured by the hydrolysis of appropriate synthetic peptide
substrates conjugated with various chromogenic molecules in which the degree of hydrolysis is
quantified by spectrophotometric (or fluorometric) absorption of the released chromophore. (Beynon,
R.J. and J.S. Bond (1994) Proteolytic Enzymes: A Practical Approach, Oxford University Press, New
York NY, pp. 25-55.) Peptide substrates are designed according to the category of protease activity as
endopeptidase (serine, cysteine, aspartic proteases), aminopeptidase (leucine aminopeptidase), or
carboxypeptidase (carboxypeptidase A and B, procollagen C-proteinase).

DITHP isomerase activity such as peptidyl prolyl cis/trans isomerase activity can be assayed
by an enzyme assay described by Rahfeld, J.U., et al. (1994) (FEBS Lett. 352:180-184). The assay
is performed at 10°C in 35 mM HEPES buffer, pH 7.8, containing chymotrypsin (0.5 mg/ml) and
DITHP at a variety of concentrations. Under these assay conditions, the substrate,
Suc-Ala-Xaa-Pro-Phe-4-NA, is in equilibrium with respect to the prolyl bond, with 80-95% in trans
and 5-20% in cis conformation. An aliquot (2 µl) of the substrate dissolved in dimethyl sulfoxide
(10 mg/ml) is added to the reaction mixture described above. Only the cis isomer of the substrate is
a substrate for cleavage by chymotrypsin. Thus, as the substrate is isomerized by DITHP, the product
is cleaved by chymotrypsin to produce 4-nitroanilide, which is detected by its absorbance at 390 nm.
4-Nitroanilide appears in a time-dependent and a DITHP concentration-dependent manner.
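
One simple way to summarize such a time course is an initial rate, taken as the least-squares slope of absorbance against time. The Python sketch below is only an illustration under that assumption (an approximately linear early portion of the trace); the function name `initial_rate` and the example readings are hypothetical, and the cited assay may be analyzed differently.

```python
def initial_rate(times_s, absorbances):
    """Least-squares slope of absorbance versus time (absorbance units per second)."""
    n = len(times_s)
    if n < 2 or n != len(absorbances):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_s) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, absorbances))
    return sxy / sxx


# Hypothetical A390 readings taken every 10 seconds (illustrative only).
print(initial_rate([0, 10, 20, 30], [0.00, 0.10, 0.20, 0.30]))
```

Comparing the slopes obtained at several DITHP concentrations would show the concentration dependence described above.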

An assay for DITHP activity associated with growth and development measures cell
proliferation as the amount of newly initiated DNA synthesis in Swiss mouse 3T3 cells. A plasmid
containing polynucleotides encoding DITHP is transfected into quiescent 3T3 cultured cells using
methods well known in the art. The transiently transfected cells are then incubated in the presence of
[3H]thymidine, a radioactive DNA precursor. Where applicable, varying amounts of DITHP ligand
are added to the transfected cells. Incorporation of [3H]thymidine into acid-precipitable DNA is
measured over an appropriate time interval, and the amount incorporated is directly proportional to
the amount of newly synthesized DNA.

Growth factor activity of DITHP is measured by the stimulation of DNA synthesis in Swiss
mouse 3T3 cells (McKay, I. and I. Leigh, eds. (1993) Growth Factors: A Practical Approach, Oxford
University Press, New York NY). Initiation of DNA synthesis indicates the cells' entry into the
mitotic cycle and their commitment to undergo later division. 3T3 cells are competent to respond to
most growth factors, not only those that are mitogenic, but also those that are involved in embryonic
induction. This competence is possible because the in vivo specificity demonstrated by some growth
factors is not necessarily inherent but is determined by the responding tissue. In this assay, varying
amounts of DITHP are added to quiescent 3T3 cultured cells in the presence of [3H]thymidine, a
radioactive DNA precursor. DITHP for this assay can be obtained by recombinant means or from
biochemical preparations. Incorporation of [3H]thymidine into acid-precipitable DNA is measured
over an appropriate time interval, and the amount incorporated is directly proportional to the amount



of newly synthesized DNA. A linear dose-response curve over at least a hundred-fold
concentration range is indicative of growth factor activity. One unit of activity per milliliter is
defined as the concentration of DITHP producing a 50% response level, where 100% represents
maximal incorporation of [3H]thymidine into acid-precipitable DNA.
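
The 50% response level can be estimated from a measured dose-response series by interpolation. The Python sketch below is a minimal illustration, assuming background-subtracted responses, ascending positive concentrations, and linear interpolation on a log10 concentration scale; the function name `half_maximal_concentration` and the example data are hypothetical.

```python
import math


def half_maximal_concentration(concs, responses):
    """Interpolate the concentration giving 50% of the maximal response.

    `concs` must be ascending and positive; `responses` are background-subtracted
    incorporation values (for example, counts per minute)."""
    half = 0.5 * max(responses)
    points = list(zip(concs, responses))
    for (c0, r0), (c1, r1) in zip(points, points[1:]):
        if r0 <= half <= r1 and r1 != r0:
            frac = (half - r0) / (r1 - r0)
            log_c = math.log10(c0) + frac * (math.log10(c1) - math.log10(c0))
            return 10 ** log_c
    raise ValueError("response never rises through 50% of its maximum")


# Hypothetical dose-response data spanning a hundred-fold range (illustrative only).
print(half_maximal_concentration([1, 10, 100], [0, 50, 100]))
```

The returned concentration corresponds to one unit of activity per milliliter under the definition given above.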

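Alternatively, an assay for cytokine activity of DITHP measures the proliferation of
leukocytes. In this assay, the amount of tritiated thymidine incorporated into newly synthesized
DNA is used to estimate proliferative activity. Varying amounts of DITHP are added to cultured
leukocytes. Incorporation of [3H]thymidine into acid-precipitable DNA is measured
over an appropriate time interval, and the amount incorporated is directly proportional to the
amount of newly synthesized DNA. A linear dose-response curve over at least a hundred-fold DITHP
concentration range is indicative of DITHP activity. One unit of activity per milliliter is
conventionally defined as the concentration of DITHP producing a 50% response level, where 100%
represents maximal incorporation of [3H]thymidine into acid-precipitable DNA.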

M.l»n»avea».,to.Dn«Pcy»tinea=.m.y.aii»saBo,de„n»c»chan*et 

CA) Al.^««o„«37-Cfc80,ol20,»M«,es.*,.fites«efix.dm^*^.ands«»^ 
!^r«.-U^...»-Ce„sw.cl,„.^.e».eo.e,*»,...«»ra»--^^^^^^ 
^d:!loscop,.T.ee— .«d.»tscalcula.^.VdMdl...^.»-^ 

. >, nriTOisoieseMillthelower«ompartn«lJyll»ii»mte'>£"»'8°'°^'^'°°"'°' 

""*:Z.l,.cel,^s„.ss^W<™=^»M.av»- — -•^-^-•'^ 
f„DnHP.c*i„by— ««d.g. aiI.«ed«u-edtoSDSind«P«s»oeotP- 

l.p»e^o,,-elcacldsre^™d^e,...l^ip«^.-.P-'-^^^ 

• Lo„ Pelteareresuspeod=dta20mMttsba»rMpK7.5andn>c»b.«d«d.IW=mG 

30 precipmoon. Pellets nrmp Ate vwshtag, the Sephatose beads Be 

Seph,.os.pre.oa,.dwlth»»tn»d,spe<:rficfaDIIHP. A*"""' ^ ^ 
W,edl..lec.te,hc^issa^eb-fcr...d.hed-.dp».»..ssubl.c.ed^SDS^ 

PAGEls.a^en.d»..itrocell..ose.-*««fcri».m*lo,^,-th.DM^^^^^ 

Ls=db,vis«^«.i,--tifth,gb«.is<»tt«b»>.«sh,8d.e».ibod,s^^^ 

3, rpll^a..rbod,ld.',-Ubel.d:gGspec»e.c..>«^-^y'»-— 

'^J».e«vi,,,sa^byphospl».,.«io.ot.p»«eln.»bs.nte.^^ 






quantitation of the incorporated radioactivity using a radioisotope counter. DITHP is
incubated with the protein substrate, [32P]-ATP, and an appropriate kinase buffer. The [32P]
incorporated into the product is separated from free [32P]-ATP by electrophoresis and the
incorporated [32P] is counted. The amount of [32P] recovered is proportional to the kinase activity of
DITHP in the assay. A determination of the specific amino acid residue phosphorylated is made by
phosphoamino acid analysis of the hydrolyzed protein.

In the alternative, DITHP activity is measured by the increase in cell proliferation resulting
from transformation of a mammalian cell line such as COS7, HeLa or CHO with a eukaryotic
expression vector encoding DITHP. Eukaryotic expression vectors are commercially available, and
the techniques to introduce them into cells are well known to those skilled in the art. The cells are
incubated for 48-72 hours after transformation under conditions appropriate for the cell line to allow
expression of DITHP. Phase microscopy is then used to compare the mitotic index of transformed
versus control cells. An increase in the mitotic index indicates DITHP activity.

In a further alternative, an assay for DITHP signaling activity is based upon the ability of
GPCR family proteins to modulate G protein-activated second messenger signal transduction
pathways (e.g., cAMP; Gaudin, P. et al. (1998) J. Biol. Chem. 273:4990-4996). A plasmid encoding
full length DITHP is transfected into a mammalian cell line (e.g., Chinese hamster ovary (CHO) or
human embryonic kidney (HEK-293) cell lines) using methods well-known in the art. Transfected
cells are grown in 12-well trays in culture medium for 48 hours, then the culture medium is
discarded, and the attached cells are gently washed with PBS. The cells are then incubated in culture
medium with or without ligand for 30 minutes, then the medium is removed and cells lysed by
treatment with 1 M perchloric acid. The cAMP levels in the lysate are measured by
radioimmunoassay using methods well-known in the art. Changes in the levels of cAMP in the lysate
from cells exposed to ligand compared to those without ligand are proportional to the amount of
DITHP present in the transfected cells.

Alternatively, an assay for DITHP protein phosphatase activity measures the hydrolysis of
p-nitrophenyl phosphate (PNPP). DITHP is incubated together with PNPP in HEPES buffer, pH 7.5, in
the presence of 0.1% β-mercaptoethanol at 37°C for 60 min. The reaction is stopped by the addition
of 6 ml of 10 N NaOH, and the increase in light absorbance of the reaction mixture at 410 nm
resulting from the hydrolysis of PNPP is measured using a spectrophotometer. The increase in light
absorbance is proportional to the phosphatase activity of DITHP in the assay (Diamond, R.H. et al.
(1994) Mol. Cell. Biol. 14:3752-3762).

An alternative assay measures DITHP-mediated G-protein signaling activity by monitoring
the mobilization of Ca2+ as an indicator of the signal transduction pathway stimulation. (See, e.g.,
Grynkiewicz, G. et al. (1985) J. Biol. Chem. 260:3440; McColl, S. et al. (1993) J. Immunol.
150:4550-4555; and Aussel, C. et al. (1988) J. Immunol. 140:215-220). The assay requires
preloading neutrophils or T cells with a fluorescent dye such as FURA-2 or BCECF (Universal
Imaging Corp., Westchester PA) whose emission characteristics are altered by Ca2+ binding. When
the cells are exposed to one or more activating stimuli artificially (e.g., anti-CD3 antibody ligation of
the T cell receptor) or physiologically (e.g., by allogeneic stimulation), Ca2+ flux takes place. This
flux can be observed and quantified by assaying the cells in a fluorometer or fluorescent activated
cell sorter. Measurements of Ca2+ flux are compared between cells in their normal state and those
transfected with DITHP. Increased Ca2+ mobilization attributable to increased DITHP concentration
is proportional to DITHP activity.

DITHP transport activity is assayed by measuring uptake of labeled substrates into Xenopus
laevis oocytes. Oocytes at stages V and VI are injected with DITHP mRNA (10 ng per oocyte) and
incubated for 3 days at 18°C in OR2 medium (82.5 mM NaCl, 2.5 mM KCl, 1 mM CaCl2, 1 mM
MgCl2, 1 mM Na2HPO4, 5 mM Hepes, 3.8 mM NaOH, 50 µg/ml gentamycin, pH 7.8) to allow
expression of DITHP protein. Oocytes are then transferred to standard uptake medium (100 mM
NaCl, 2 mM KCl, 1 mM CaCl2, 1 mM MgCl2, 10 mM Hepes/Tris pH 7.5). Uptake of various
substrates (e.g., amino acids, sugars, drugs, ions, and neurotransmitters) is initiated by adding labeled
substrate (e.g., radiolabeled with 3H, fluorescently labeled with rhodamine, etc.) to the oocytes. After
incubating for 30 minutes, uptake is terminated by washing the oocytes three times in Na+-free
medium, measuring the incorporated label, and comparing with controls. DITHP transport activity is
proportional to the level of internalized labeled substrate.

DITHP transferase activity is demonstrated by a test for galactosyltransferase activity. This
can be determined by measuring the transfer of radiolabeled galactose from UDP-galactose to a
GlcNAc-terminated oligosaccharide chain (Kolbinger, F. et al. (1998) J. Biol. Chem. 273:58-65).
The sample is incubated with 14 µl of assay stock solution (180 mM sodium cacodylate, pH 6.5, 1
mg/ml bovine serum albumin, 0.26 mM UDP-galactose, 2 µl of UDP-[3H]galactose), 1 µl of MnCl2
(500 mM), and 2.5 µl of GlcNAcβO-(CH2)8-CO2Me (37 mg/ml in dimethyl sulfoxide) for 60 minutes
at 37°C. The reaction is quenched by the addition of 1 ml of water and loaded on a C18 Sep-Pak
cartridge (Waters), and the column is washed twice with 5 ml of water to remove unreacted
UDP-[3H]galactose. The [3H]galactosylated GlcNAcβO-(CH2)8-CO2Me remains bound to the column
during the water washes and is eluted with 5 ml of methanol. Radioactivity in the eluted material is
measured by liquid scintillation counting and is proportional to galactosyltransferase activity in the
starting sample.

In the alternative, DITHP induction by heat or toxins may be demonstrated using primary
cultures of human fibroblasts or human cell lines such as CCL-13, HEK293, or HEPG2 (ATCC). To
heat induce DITHP expression, aliquots of cells are incubated at 42°C for 15, 30, or 60 minutes.
Control aliquots are incubated at 37°C for the same time periods. To induce DITHP expression by
toxins, aliquots of cells are treated with 100 µM arsenite or 20 mM azetidine-2-carboxylic acid for 0,
3, [...]. After exposure to heat, arsenite, or the amino acid analogue,
cells are harvested and cell lysates prepared for analysis by western blot. Cells are lysed in lysis
buffer containing 1% Nonidet P-40, 0.15 M NaCl, 50 mM Tris-HCl, 5 mM EDTA, 2 mM
N-ethylmaleimide, 2 mM phenylmethylsulfonyl fluoride, 1 mg/ml leupeptin, and 1 mg/ml pepstatin.
Twenty micrograms of the cell lysate is separated on an 8% SDS-PAGE gel and transferred to a
membrane. After blocking with 5% nonfat dry milk/phosphate-buffered saline for 1 h, the membrane
is incubated overnight at 4°C or at room temperature for 2-4 hours with a 1:1000 dilution of
anti-DITHP serum in 2% nonfat dry milk/phosphate-buffered saline. The membrane is then washed
and incubated with a 1:1000 dilution of horseradish peroxidase-conjugated goat anti-rabbit IgG in 2%
dry milk/phosphate-buffered saline. After washing with 0.1% Tween 20 in phosphate-buffered
saline, the DITHP protein is detected and compared to controls using chemiluminescence.

Alternatively, DITHP protease activity is measured by the hydrolysis of appropriate synthetic
peptide substrates conjugated with various chromogenic molecules in which the degree of hydrolysis
is quantified by spectrophotometric (or fluorometric) absorption of the released chromophore
(Beynon, R.J. and J.S. Bond (1994) Proteolytic Enzymes: A Practical Approach, Oxford University
Press, New York, NY, pp. 25-55). Peptide substrates are designed according to the category of
protease activity as endopeptidase (serine, cysteine, aspartic proteases, or metalloproteases),
aminopeptidase (leucine aminopeptidase), or carboxypeptidase (carboxypeptidases A and B,
procollagen C-proteinase). Commonly used chromogens are 2-naphthylamine, 4-nitroaniline, and
furylacrylic acid. Assays are performed at ambient temperature and contain an aliquot of the enzyme
and the appropriate substrate in a suitable buffer. Reactions are carried out in an optical cuvette, and
the increase/decrease in absorbance of the chromogen released during hydrolysis of the peptide
substrate is measured. The change in absorbance is proportional to the DITHP protease activity in
the assay.

In the alternative, an assay for DITHP protease activity takes advantage of fluorescence
resonance energy transfer (FRET) that occurs when one donor and one acceptor fluorophore with an
appropriate spectral overlap are in close proximity. A flexible peptide linker containing a cleavage
site specific for PRTS is fused between a red-shifted variant (RSGFP4) and a blue variant (BFP5) of
Green Fluorescent Protein. This fusion protein has spectral properties that suggest energy transfer is
occurring from BFP5 to RSGFP4. When the fusion protein is incubated with DITHP, the substrate is
cleaved, and the two fluorescent proteins dissociate. This is accompanied by a marked decrease in
energy transfer which is quantified by comparing the emission spectra before and after the addition of
DITHP (Mitra, R.D. et al. (1996) Gene 173:13-17). This assay can also be performed in living cells.
In this case the fluorescent substrate protein is expressed constitutively in cells and DITHP is
introduced on an inducible vector so that FRET can be monitored in the presence and absence of
DITHP (Sagot, I. et al. (1999) FEBS Lett. 447:53-57).
A method to determine the nucleic acid binding activity of DITHP involves a
gel mobility-shift assay. In preparation for this assay, DITHP is expressed by transforming a
mammalian cell line such as COS7, HeLa or CHO with a eukaryotic expression vector containing
DITHP cDNA. The cells are incubated for 48-72 hours after transformation under conditions
appropriate for the cell line to allow expression and accumulation of DITHP. Extracts containing
solubilized proteins can be prepared from cells expressing DITHP by methods well known in the art.
Portions of the extract containing DITHP are added to [32P]-labeled RNA or DNA. Radioactive
nucleic acid can be synthesized in vitro by techniques well known in the art. The mixtures are
incubated at 25°C in the presence of RNase- and DNase-inhibitors under buffered conditions for 5-10
minutes. After incubation, the samples are analyzed by polyacrylamide gel electrophoresis followed
by autoradiography. The presence of a band on the autoradiogram indicates the formation of a
complex between DITHP and the radioactive transcript. A band of similar mobility will not be
present in samples prepared using control extracts prepared from untransformed cells.

In the alternative, a method to determine the methylase activity of a DITHP measures transfer
of radiolabeled methyl groups between a donor substrate and an acceptor substrate. Reaction
mixtures (50 µl final volume) contain 15 mM HEPES, pH 7.9, 1.5 mM MgCl2, 10 mM dithiothreitol,
3% polyvinylalcohol, 1.5 µCi [methyl-3H]AdoMet (0.375 µM AdoMet) (DuPont-NEN), 0.6 µg
DITHP, and acceptor substrate (e.g., 0.4 µg [35S]RNA, or 6-mercaptopurine (6-MP) to 1 mM final
concentration). Reaction mixtures are incubated at 30°C for 30 minutes, then 65°C for 5 minutes.
Analysis of [methyl-3H]RNA is as follows: 1) 50 µl of 2 x loading buffer (20 mM Tris-HCl, pH 7.6, 1
M LiCl, 1 mM EDTA, 1% sodium dodecyl sulphate (SDS)) and 50 µl oligo d(T)-cellulose (10 mg/ml
in 1 x loading buffer) are added to the reaction mixture, and incubated at ambient temperature with
shaking for 30 minutes. 2) Reaction mixtures are transferred to a 96-well filtration plate attached to a
vacuum apparatus. 3) Each sample is washed sequentially with three 2.4 ml aliquots of 1 x oligo
d(T) loading buffer containing 0.5% SDS, 0.1% SDS, or no SDS. 4) RNA is eluted with 300 µl
of water into a 96-well collection plate, transferred to scintillation vials containing liquid scintillant,
and radioactivity determined. Analysis of [methyl-3H]6-MP is as follows: 1) 500 µl 0.5 M borate
buffer, pH 10.0, and then 2.5 ml of 20% (v/v) isoamyl alcohol in toluene are added to the reaction
mixtures. 2) The samples are mixed by vigorous vortexing for ten seconds. 3) After centrifugation at
700g for 10 minutes, 1.5 ml of the organic phase is transferred to scintillation vials containing 0.5 ml
absolute ethanol and liquid scintillant, and radioactivity determined. 4) Results are corrected for
the extraction of 6-MP into the organic phase (approximately 41%).
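
The final correction step can be sketched in Python. The text states only that results are corrected for an extraction of approximately 41%; the sketch assumes this means that about 41% of the labeled material is recovered in the organic phase, and it additionally scales for counting 1.5 ml of the 2.5 ml organic phase. Both interpretations, the function name `corrected_counts`, and the example value are assumptions for illustration.

```python
def corrected_counts(measured_cpm, counted_ml=1.5, organic_ml=2.5,
                     extraction_fraction=0.41):
    """Estimate total labeled product from the counts in the counted aliquot.

    measured_cpm        counts in the aliquot transferred to the vial
    counted_ml          volume of organic phase counted
    organic_ml          total volume of organic phase added
    extraction_fraction assumed fraction recovered in the organic phase
    """
    if not 0 < extraction_fraction <= 1:
        raise ValueError("extraction_fraction must be in (0, 1]")
    total_in_organic = measured_cpm * (organic_ml / counted_ml)
    return total_in_organic / extraction_fraction


# Hypothetical reading of 246 cpm in the counted aliquot (illustrative only).
print(corrected_counts(246.0))
```

Keeping the volumes and the extraction fraction as parameters makes it easy to substitute values measured for a particular extraction.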

An assay for adhesion activity of DITHP measures the disruption of cytoskeletal filament
networks upon overexpression of DITHP in cultured cell lines (Rezniczek, G.A. et al. (1998) J. Cell
Biol. 141:209-225). cDNA encoding DITHP is subcloned into a mammalian expression vector that
drives high levels of cDNA expression. This construct is transfected into cultured cells, such as rat



kangaroo PtK2 or rat bladder carcinoma 804G cells. Actin filaments and intermediate filaments such
as keratin and vimentin are visualized by immunofluorescence microscopy using antibodies and
techniques well known in the art. The configuration and abundance of cytoskeletal filaments can be
assessed and quantified using confocal imaging techniques. In particular, the bundling and collapse
of cytoskeletal filament networks is indicative of DITHP adhesion activity.

Alternatively, an assay for DITHP activity measures the expression of DITHP on the cell
surface. cDNA encoding DITHP is transfected into a non-leukocytic cell line. Cell surface proteins
are labeled with biotin (de la Fuente, M.A. et al. (1997) Blood 90:2398-2405). Immunoprecipitations
are performed using DITHP-specific antibodies, and immunoprecipitated samples are analyzed using
SDS-PAGE and immunoblotting techniques. The ratio of labeled immunoprecipitant to unlabeled
immunoprecipitant is proportional to the amount of DITHP expressed on the cell surface.

Alternatively, an assay for DITHP activity measures the amount of cell aggregation induced
by overexpression of DITHP. In this assay, cultured cells such as NIH3T3 are transfected with
cDNA encoding DITHP contained within a suitable mammalian expression vector under control of a
strong promoter. Cotransfection with cDNA encoding a fluorescent marker protein, such as Green
Fluorescent Protein (CLONTECH), is useful for identifying stable transfectants. The amount of cell
agglutination, or clumping, associated with transfected cells is compared with that associated with
untransfected cells. The amount of cell agglutination is a direct measure of DITHP activity.

DITHP may recognize and precipitate antigen from serum. This activity can be measured by
the quantitative precipitin reaction (Golub, E.S. et al. (1987) Immunology: A Synthesis, Sinauer
Associates, Sunderland MA, pages 113-115). DITHP is isotopically labeled using methods known in
the art. Various serum concentrations are added to constant amounts of labeled DITHP.
DITHP-antigen complexes precipitate out of solution and are collected by centrifugation. The amount of
precipitable DITHP-antigen complex is proportional to the amount of radioisotope detected in the
precipitate. The amount of precipitable DITHP-antigen complex is plotted against the serum
concentration. For various serum concentrations, a characteristic precipitation curve is obtained, in
which the amount of precipitable DITHP-antigen complex initially increases proportionately with
increasing serum concentration, peaks at the equivalence point, and then decreases proportionately
with further increases in serum concentration. Thus, the amount of precipitable DITHP-antigen
complex is a measure of DITHP activity which is characterized by sensitivity to both limiting and
excess quantities of antigen.
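
Locating the equivalence point on such a curve amounts to finding the serum concentration at which the precipitated signal peaks. The Python sketch below is a minimal illustration that takes the highest measured point as the peak, without curve fitting; the function name `equivalence_point` and the example data are hypothetical.

```python
def equivalence_point(serum_concs, precipitated):
    """Return (serum concentration, signal) at the maximum of the precipitin curve."""
    if not serum_concs or len(serum_concs) != len(precipitated):
        raise ValueError("need paired, non-empty measurements")
    peak = max(range(len(precipitated)), key=lambda i: precipitated[i])
    return serum_concs[peak], precipitated[peak]


# Hypothetical radioisotope counts in the precipitate (illustrative only).
print(equivalence_point([1, 2, 4, 8, 16], [10, 30, 55, 40, 15]))
```

Points to the left of the returned concentration lie in the antigen-limited region, and points to the right lie in the antigen-excess region.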

A microtubule motility assay for DITHP measures motor protein activity. In this assay,
recombinant DITHP is immobilized onto a glass slide or similar substrate. Taxol-stabilized bovine
brain microtubules (commercially available) in a solution containing ATP and cytosolic extract are
perfused onto the slide. Movement of microtubules as driven by DITHP motor activity can be
visualized and quantified using video-enhanced light microscopy and image analysis techniques.
DITHP motor protein activity is directly proportional to the frequency and velocity of microtubule
movement.

Alternatively, an assay for DITHP measures the formation of protein filaments in vitro. A
solution of DITHP at a concentration greater than the "critical concentration" for polymer assembly is
applied to carbon-coated grids. Appropriate nucleation sites may be supplied in the solution. The
grids are negative stained with 0.7% aqueous uranyl acetate and examined by electron
microscopy. The appearance of filaments of approximately 25 nm (microtubules), 8 nm (actin), or 10
nm (intermediate filaments) is a demonstration of protein activity.

DITHP electron transfer activity is demonstrated by oxidation or reduction of NADP.
Substrates such as Asn-βGal, biocytidine, or ubiquinone-10 may be used. The reaction mixture
contains 1-2 mg/ml HORP, 15 mM substrate, and 2.4 mM NAD(P)+ in 0.1 M phosphate buffer, pH
7.1 (oxidation reaction), or 2.0 mM NAD(P)H, in 0.1 M Na2HPO4 buffer, pH 7.4 (reduction reaction);
in a total volume of 0.1 ml. FAD may be included with NAD, according to methods well known in
the art. Changes in absorbance are measured using a recording spectrophotometer. The amount of
NAD(P)H is stoichiometrically equivalent to the amount of substrate initially present, and the change
in A340 is a direct measure of the amount of NAD(P)H produced; ΔA340 = 6620[NADH]. DITHP
activity is proportional to the amount of NAD(P)H present in the assay. The increase in extinction
coefficient of NAD(P)H coenzyme at 340 nm is a measure of oxidation activity, or the decrease in
extinction coefficient of NAD(P)H coenzyme at 340 nm is a measure of reduction activity (Dalziel,
K. (1963) J. Biol. Chem. 238:2850-2858).
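
The relation between the absorbance change and NAD(P)H concentration can be written as a one-line calculation. The Python sketch below takes the coefficient 6620 as it appears in the text and assumes it is in units of M^-1 cm^-1 with a 1 cm path length, neither of which is stated. The value commonly tabulated for NADH at 340 nm is 6220 M^-1 cm^-1, so the coefficient is left as a parameter. The function name `nadph_concentration` is hypothetical.

```python
def nadph_concentration(delta_a340, epsilon=6620.0, path_cm=1.0):
    """Convert a change in absorbance at 340 nm to a molar concentration
    using the Beer-Lambert law: delta_A = epsilon * path * concentration."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path_cm must be positive")
    return delta_a340 / (epsilon * path_cm)


# Hypothetical absorbance change (illustrative only); result is in mol/l.
print(nadph_concentration(0.662))
```

A positive result corresponds to NAD(P)H production (oxidation of the substrate), and a negative result to NAD(P)H consumption.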

DITHP transcription factor activity is measured by its ability to stimulate transcription of a
reporter gene (Liu, H.Y. et al. (1997) EMBO J. 16:5289-5298). The assay entails the use of a well
characterized reporter gene construct, LexAop-LacZ, that consists of LexA DNA transcriptional
control elements (LexAop) fused to sequences encoding the E. coli LacZ enzyme. The methods for
constructing and expressing fusion genes, introducing them into cells, and measuring LacZ enzyme
activity, are well known to those skilled in the art. Sequences encoding DITHP are cloned into a
plasmid that directs the synthesis of a fusion protein, LexA-DITHP, consisting of DITHP and a DNA
binding domain derived from the LexA transcription factor. The resulting plasmid, encoding a
LexA-DITHP fusion protein, is introduced into yeast cells along with a plasmid containing the
LexAop-LacZ reporter gene. The amount of LacZ enzyme activity associated with LexA-DITHP
transfected cells, relative to control cells, is proportional to the amount of transcription stimulated
by the DITHP.
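
The comparison with control cells can be expressed as a fold activation. The Python sketch below is a minimal illustration that divides the mean LacZ activity of LexA-DITHP cells by the mean activity of control cells; the function name `fold_activation`, the use of replicate means, and the example numbers are assumptions.

```python
from statistics import mean


def fold_activation(test_units, control_units):
    """Ratio of mean LacZ activity in LexA-DITHP cells to that in control cells."""
    control_mean = mean(control_units)
    if control_mean <= 0:
        raise ValueError("control activity must be positive")
    return mean(test_units) / control_mean


# Hypothetical replicate LacZ activities in arbitrary units (illustrative only).
print(fold_activation([120.0, 80.0], [10.0, 10.0]))
```

A ratio near 1 would indicate no stimulation of the reporter above the control level.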

Chromatin activity of DITHP is demonstrated by measuring sensitivity to DNase I (Dawson,
B.A. et al. (1989) J. Biol. Chem. 264:12830-12837). Samples are treated with DNase I, followed by
insertion of a cleavable biotinylated nucleotide analog,
5-[(N-biotinamido)hexanoamido-ethyl-1,3-thiopropionyl-3-aminoallyl]-2'-deoxyuridine 5'-triphosphate,
using nick-repair techniques well known to those skilled in the art. Following purification and
digestion with EcoRI restriction endonuclease, biotinylated sequences are affinity isolated by
sequential binding to streptavidin and biotincellulose.

Another specific assay demonstrates the ion conductance capacity of DITHP using an
electrophysiological assay. DITHP is expressed by transforming a mammalian cell line such as
COS7, HeLa or CHO with a eukaryotic expression vector encoding DITHP. Eukaryotic expression
vectors are commercially available, and the techniques to introduce them into cells are well known to
those skilled in the art. A small amount of a second plasmid, which expresses any one of a number of
marker genes such as β-galactosidase, is co-transformed into the cells in order to allow rapid
identification of those cells which have taken up and expressed the foreign DNA. The cells are
incubated for 48-72 hours after transformation under conditions appropriate for the cell line to allow
expression and accumulation of DITHP and β-galactosidase. Transformed cells expressing
β-galactosidase are stained blue when a suitable colorimetric substrate is added to the culture media
under conditions that are well known in the art. Stained cells are tested for differences in membrane
conductance due to various ions by electrophysiological techniques that are well known in the art.
Untransformed cells, and/or cells transformed with either vector sequences alone or β-galactosidase
sequences alone, are used as controls and tested in parallel. The contribution of DITHP to cation or
anion conductance can be shown by incubating the cells using antibodies specific for DITHP.
The respective antibodies will bind to the extracellular side of DITHP, thereby blocking the pore in
the ion channel, and the associated conductance.

An assay for DITHP activity measures the expression of DITHP on the cell surface. cDNA
encoding DITHP is subcloned into an appropriate mammalian expression vector suitable for high
levels of cDNA expression. The resulting construct is transfected into a nonhuman cell line such as
NIH3T3. Cell surface proteins are labeled with biotin using methods known in the art.
Immunoprecipitations are performed using DITHP-specific antibodies, and immunoprecipitated
samples are analyzed using SDS-PAGE and immunoblotting techniques. The ratio of labeled
immunoprecipitant to unlabeled immunoprecipitant is proportional to the amount of DITHP
expressed on the cell surface.

Alternatively, an assay for DITHP activity measures the amount of DITHP in secretory,
membrane-bound organelles. Transfected cells as described above are harvested and lysed. The
lysate is fractionated using methods known to those of skill in the art, for example, sucrose gradient
ultracentrifugation. Such methods allow the isolation of subcellular components such as the Golgi
apparatus, ER, small membrane-bound vesicles, and other secretory organelles.
Immunoprecipitations from fractionated and total cell lysates are performed using DITHP-specific
antibodies, and immunoprecipitated samples are analyzed using SDS-PAGE and immunoblotting
techniques. The concentration of DITHP in secretory organelles relative to DITHP in total cell lysate
is proportional to the amount of DITHP in transit through the secretory pathway.
XV. Functional Assays
[...] by reference. The 5' biased cDNA ends are cloned [...] beta
lactamase gene. 5' cDNA ends harboring inherent characteristics, such as the presence of signal
peptides or transmembrane domains, when fused in-frame to the beta lactamase C-terminus will
confer survival when recombinant E. coli clones are grown on antibiotic selective media. Clones
exhibiting antibiotic resistance are sequenced and derived nucleic acid sequences are analyzed for the
presence of signal peptide or transmembrane regions.
XVI. Production of Antibodies 

DITHP substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g.,
Harrington, M.G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to
immunize rabbits and to produce antibodies using standard protocols.

Alternatively, the DITHP amino acid sequence is analyzed using LASERGENE software
(DNASTAR) to determine regions of high immunogenicity, and a corresponding peptide is
synthesized and used to raise antibodies by means known to those of skill in the art. Methods for
selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well
described in the art. (See, e.g., Ausubel, 1995, supra, Chapter 11.)

Typically, peptides 15 residues in length are synthesized using an ABI 431A peptide
synthesizer (Applied Biosystems) using fmoc-chemistry and coupled to KLH (Sigma) by reaction
with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to increase immunogenicity. (See,
e.g., Ausubel, supra.) Rabbits are immunized with the peptide-KLH complex in complete Freund's
adjuvant. Resulting antisera are tested for antipeptide activity by, for example, binding the peptide to
plastic, blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with
radio-iodinated goat anti-rabbit IgG. Antisera with antipeptide activity are tested for anti-DITHP activity
using protocols well known in the art, including ELISA, RIA, and immunoblotting.
XVII. Purification of Naturally Occurring DITHP Using Specific Antibodies 

Naturally occurring or recombinant DITHP is substantially purified by immunoaffinity 
chromatography using antibodies specific for DITHP. An immunoaffinity column is constructed by 
covalently coupling anti-DITHP antibody to an activated chromatographic resin, such as 
CNBr-activated SEPHAROSE (Amersham Pharmacia Biotech). After the coupling, the resin is 
blocked and washed according to the manufacturer's instructions. 

Media containing DITHP are passed over the immunoaffinity column, and the column is 
washed under conditions that allow the preferential absorbance of DITHP (e.g., high ionic strength 
buffers in the presence of detergent). The column is eluted under conditions that disrupt 
antibody/DITHP binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such 
as urea or thiocyanate ion), and DITHP is collected. 
XVIII. Identification of Molecules Which Interact with DITHP 

DITHP, or biologically active fragments thereof, are labeled with 125I Bolton-Hunter reagent. 
(See, e.g., Bolton, A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539.) Candidate molecules 
previously arrayed in the wells of a multi-well plate are incubated with the labeled DITHP, washed, 
and any wells with labeled DITHP complex are assayed. Data obtained using different 
concentrations of DITHP are used to calculate values for the number, affinity, and association of 
DITHP with the candidate molecules. 
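
The specification does not name the calculation used to derive these values. As one possible illustration, the following minimal sketch applies a Scatchard analysis, which assumes a single class of independent binding sites, so that bound/free = (Bmax - bound)/Kd. A straight line fitted to bound/free versus bound then has slope -1/Kd and intercept Bmax/Kd. The function names and the example values are hypothetical.

```python
# Minimal sketch, assuming single-site binding: B = Bmax * F / (Kd + F).
# Scatchard form: B/F = Bmax/Kd - B/Kd (a line in B with slope -1/Kd).


def scatchard_fit(free, bound):
    """Least-squares line through (bound, bound/free); returns (kd, bmax)."""
    if len(free) != len(bound) or len(free) < 2:
        raise ValueError("need at least two paired measurements")
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope      # dissociation constant (affinity)
    bmax = intercept * kd  # total number of binding sites
    return kd, bmax


def simulate_bound(free, kd, bmax):
    """Generate ideal single-site binding data (hypothetical values for the demo)."""
    return [bmax * f / (kd + f) for f in free]


if __name__ == "__main__":
    free = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]          # free labeled ligand, arbitrary units
    bound = simulate_bound(free, kd=2.0, bmax=100.0)  # synthetic, noise-free data
    kd, bmax = scatchard_fit(free, bound)
    print(f"Kd = {kd:.3f}, Bmax = {bmax:.3f}, Ka = {1.0 / kd:.3f}")
```

Here Bmax corresponds to the number of binding sites, Kd to the affinity, and Ka = 1/Kd to the association constant. With noisy experimental data, nonlinear regression of the untransformed binding curve is generally preferred over the linearized form.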

Alternatively, molecules interacting with DITHP are analyzed using the yeast two-hybrid 
system as described in Fields, S. and O. Song (1989) Nature 340:245-246, or using commercially 
available kits based on the two-hybrid system, such as the MATCHMAKER system (CLONTECH). 
DITHP may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT), 
which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions 
between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S. 
Patent No. 6,057,101). 

All publications and patents mentioned in the above specification are herein incorporated by 
reference. Various modifications and variations of the described method and system of the invention 
will be apparent to those skilled in the art without departing from the scope and spirit of the 
invention. Although the invention has been described in connection with specific preferred 
embodiments, it should be understood that the invention as claimed should not be unduly limited to 
such specific embodiments. Indeed, various modifications of the above-described modes for carrying 
out the invention which are obvious to those skilled in the field of molecular biology or related fields 
are intended to be within the scope of the following claims. 

What is claimed is: 



1. An isolated polynucleotide comprising a polynucleotide sequence selected from the group 
consisting of: 

a) a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-2722, 

b) a polynucleotide sequence comprising a naturally occurring polynucleotide sequence at 
least 90% identical to a polynucleotide sequence selected from the group consisting of 
SEQ ID NO: 1-2722, 

c) a polynucleotide complementary to a polynucleotide of a), 

d) a polynucleotide complementary to a polynucleotide of b), and 

e) an RNA equivalent of a) through d). 



2. An isolated polynucleotide of claim 1, comprising a polynucleotide sequence selected 
from the group consisting of SEQ ID NO: 1-2722. 

3. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a 
polynucleotide of claim 1. 



4. A composition for the detection of expression of diagnostic and therapeutic 
polynucleotides comprising at least one of the polynucleotides of claim 1 and a detectable label. 

5. A method for detecting a target polynucleotide in a sample, said target polynucleotide 
having a sequence of a polynucleotide of claim 1, the method comprising: 

a) amplifying said target polynucleotide or fragment thereof using polymerase chain 
reaction amplification, and 

b) detecting the presence or absence of said amplified target polynucleotide or fragment 
thereof, and, optionally, if present, the amount thereof. 



6. A method for detecting a target polynucleotide in a sample, said target polynucleotide 
comprising a sequence of a polynucleotide of claim 1, the method comprising: 

a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides 
comprising a sequence complementary to said target polynucleotide in the sample, and 
which probe specifically hybridizes to said target polynucleotide, under conditions 
whereby a hybridization complex is formed between said probe and said target 
polynucleotide or fragments thereof, and 

b) detecting the presence or absence of said hybridization complex, and, optionally, if 
present, the amount thereof. 

[Claims 7 through 15 are largely illegible in the source. Legible fragments: "... polynucleotide of claim 1."; "... polynucleotides of claim 2."; "... of claim 13."; "... therapeutic polypeptide."] 

16. A microarray wherein at least one element of the microarray is a polynucleotide of claim 



17. A method for generating a transcript image of a sample which contains polynucleotides, 
the method comprising the steps of: 

a) labeling the polynucleotides of the sample, 

b) contacting the elements of the microarray of claim 16 with the labeled polynucleotides of 
the sample under conditions suitable for the formation of a hybridization complex, and 

c) quantifying the expression of the polynucleotides in the sample. 

18. A method for screening a compound for effectiveness in altering expression of a target 
polynucleotide, wherein said target polynucleotide comprises a polynucleotide sequence of claim 1, 
the method comprising: 

a) exposing a sample comprising the target polynucleotide to a compound, under conditions 
suitable for the expression of the target polynucleotide, 

b) detecting altered expression of the target polynucleotide, and 

c) comparing the expression of the target polynucleotide in the presence of varying amounts 
of the compound and in the absence of the compound. 

19. A method for assessing toxicity of a test compound, said method comprising: 

a) treating a biological sample containing nucleic acids with the test compound; 

b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at 
least 20 contiguous nucleotides of a polynucleotide of claim 1 under conditions whereby 
a specific hybridization complex is formed between said probe and a target 
polynucleotide in the biological sample, said target polynucleotide comprising a 
polynucleotide sequence of a polynucleotide of claim 1 or fragment thereof; 

c) quantifying the amount of hybridization complex; and 

d) comparing the amount of hybridization complex in the treated biological sample with the 
amount of hybridization complex in an untreated biological sample, wherein a difference 
in the amount of hybridization complex in the treated biological sample is indicative of 
toxicity of the test compound. 

20. An array comprising different nucleotide molecules affixed in distinct physical locations 
on a solid substrate, wherein at least one of said nucleotide molecules comprises a first 
oligonucleotide or polynucleotide sequence specifically hybridizable with at least 30 contiguous 
nucleotides of a target polynucleotide, said target polynucleotide having a sequence of claim 1. 

21. An array of claim 20, wherein said first oligonucleotide or polynucleotide sequence is 
completely complementary to at least 30 contiguous nucleotides of said target polynucleotide. 



22. An array of claim 20, wherein said first oligonucleotide or polynucleotide sequence is 
completely complementary to at least 60 contiguous nucleotides of said target polynucleotide. 

23. An array of claim 20, which is a microarray. 

24. An array of claim 20, further comprising said target polynucleotide hybridized to said 
first oligonucleotide or polynucleotide. 



25. An array of claim 20, wherein a linker joins at least one of said nucleotide molecules to 
said solid substrate. 



26. An array of claim 20, wherein each distinct physical location on the substrate contains 
multiple nucleotide molecules having the same sequence, and each distinct physical location on the 
substrate contains nucleotide molecules having a sequence which differs from the sequence of 
nucleotide molecules at another physical location on the substrate. 



27. An isolated polypeptide selected from the group consisting of: 

a) a polypeptide comprising an amino acid sequence selected from the group consisting of 
SEQ ID NO:2723-5444, 

b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% 
identical to an amino acid sequence selected from the group consisting of SEQ ID 
NO:2723-5444, 

c) a biologically active fragment of a polypeptide having an amino acid sequence selected 
from the group consisting of SEQ ID NO:2723-5444, and 

d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from 
the group consisting of SEQ ID NO:2723-5444. 

28. An isolated polypeptide of claim 27, comprising a polypeptide sequence selected from 
the group consisting of SEQ ID NO:2723-5444. 

[...] oligonucleotides, or fragments thereof in diagnostic assays. The invention further provides for vectors and host cells containing 
dithp for the expression of DITHP. The invention additionally provides for the use of isolated and purified DITHP to induce antibodies 
and to screen libraries of compounds and the use of anti-DITHP antibodies in diagnostic assays. Also provided are microarrays 
containing dithp and methods of use. 
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